Tom Talks Too Much: 2009

Wednesday, December 16, 2009

More research notes for the future

Notes not for today, but maybe tomorrow.

Catch phrase:
Physically rich intelligence and autonomy.

Research topics:
* Domestic robots.
* Learning and control of many sensors and actuators.
* Dynamic environments. Real-time control. Anytime algorithms.
* Life-long learning. Many concepts to relate and scale.

Platforms:
* World simulator. Parameterized, articulation, fluids, particles, sound and light tracing. The higher fidelity, the better.
* Miniature off-the-shelf humanoids (such as the Aldebaran Nao) with many actuators and sensors.
* Build on and in support of low-end computers, though allow scalability.

Goals:
* Open-source, liberally-licensed, productized-quality software projects.
* Publicly accessible research, both in software (WebGL?) and articles/documentation.
* Support a few grad students and as much academic freedom for them as possible.

Tuesday, December 1, 2009

Solution density

In the real world, there are so many variables to consider that we'd never get anything done if we considered all the options. Do we need to check the headlights of the neighbors' cars to know if we should make oatmeal or pancakes for breakfast? Does the arrangement of dust particles on the back porch matter?

We're good at focusing on the important details. Along with our experience, that allows us to make quick decisions.

I'm wondering what formulations exist for intractable problems in computer science that relate to solution density in the sparse area of world configurations that actually matters. Seems perhaps more meaningful than trying to measure actual problem size.

Tuesday, November 24, 2009

Dynamic constness

I've been thinking about how 'const' (C++ style) seems like a good idea but can cause pain in practice. It leads to duplicate definitions of methods (termed a "shadow world" by Anders Hejlsberg). It really is hard to enforce statically.

See here for some great summaries.

Personally, I think the "bit" idea might work. That is, pointers are usually aligned to 4-byte boundaries. That gives two bits to play with. One of those bits might serve well for indicating constness in a dynamic fashion.

Two different pointers to the same object. One carries const. The other doesn't. Enforcement happens at runtime. Sounds maybe doable.

I'd need to think more about the container issue, though. First thought is that any const pointer is just deeply const.
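Here's a rough sketch of how the const-bit idea might look in C++. This is only my guess at one possible shape, not a worked-out design: the names 'DynPtr', 'asConst', 'read', and 'write' are made up for illustration, and it assumes the pointed-to type is aligned to at least 2 bytes so the lowest address bit is free. It also ignores the container question entirely.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical wrapper: keeps a "const" flag in the lowest bit of the address.
// Assumes T is aligned to at least 2 bytes, so that bit is otherwise unused.
template <typename T>
class DynPtr {
    static_assert(alignof(T) >= 2, "need a spare low bit in the address");
    std::uintptr_t bits_;

    // Private: build directly from already-tagged bits.
    explicit DynPtr(std::uintptr_t bits) : bits_(bits) {}

    // Strip the tag bit to recover the real address.
    std::uintptr_t address() const { return bits_ & ~std::uintptr_t(1); }

public:
    explicit DynPtr(T* p) : bits_(reinterpret_cast<std::uintptr_t>(p)) {}

    // Same object, but the returned handle carries the const bit.
    DynPtr asConst() const { return DynPtr(bits_ | std::uintptr_t(1)); }

    bool isConst() const { return (bits_ & std::uintptr_t(1)) != 0; }

    // Reading is always allowed.
    const T& read() const { return *reinterpret_cast<const T*>(address()); }

    // Writing is checked at run time instead of at compile time.
    T& write() const {
        if (isConst()) throw std::logic_error("write through const pointer");
        return *reinterpret_cast<T*>(address());
    }
};

// Small example type for trying it out.
struct Person { int age; };
```

With this, 'DynPtr<Person> rw(&p)' and 'rw.asConst()' are two handles to the same object, and only the second one throws on 'write()'. The cost is a mask on every dereference and a branch on every write.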

Friday, November 20, 2009

Course pre tests

I was just wondering what the value might be of testing folks at the beginning of a course on the subject matter. Some people come in with more knowledge than others.

I'm afraid that without that, it's harder to see who's actually learning. And without that, it's harder to know when teaching has been effective.

Or in other words, when evaluating learning, I'd like to subtract out those who already knew a lot.

Wednesday, November 18, 2009

Exceptions, Processes, Java, & Go

My opinion on exception/error handling is this: I (and I believe most other programmers) don't want to spend the time to carefully analyze every possible unlikely error case as I go. Some cases should be carefully handled. But if things go bad in an unexpected fashion, I believe this is what should happen: clean up all resources safely and then crash.

Seriously. And that's what exception handling does, if you design your code right. Either nice automatic resource cleanup (such as what's common in C#, Ruby, Python, C++, or upcoming in Java 7) or somewhat manual work such as Java finally blocks can take care of cleanup during crash.

But crash? Yes. The question is the scope of the crash.

The crash should affect the current request. For a web server, yep, that's the HTTP request. For a database, it would be the current query. For a GUI, it's the current user action.

In olden Unix days, processes were designed to do one thing and do it well. Globals were less (though still somewhat) evil, since every process was sort of like its own object. Global death was also less evil. Therefore, the old-timey C 'assert' feature was somewhat reasonable. If you kill the process, it generally kills the whole pipe chain, which also effectively kills the user command-line request. Quite reasonable in most cases.

If each response to a GUI command ran as its own process (interesting thought and probably been done by someone before), then killing that process by assert wouldn't be so evil.

But most of us seem to have decided that multi-process is too heavy and slow. We don't really need all that process weight. Even to the extent of "stackless" and subthread coroutines like in Erlang, Stackless Python, and now Go. But in any case, whether threads or smaller, killing processes by assert is clearly scary. Instead of killing the "request", you crash the whole system. Bad news.

Enter exception handling, which works well enough that almost all recent programming languages use very similar models with equivalents of 'throw' and 'catch'.

What doesn't work is expecting that everyone will want to pay attention to every possible unlikely error case along the way. Checked exceptions in Java and manual error code checking in C lead to the same problem: people ignore errors. Checked exceptions lead to ignored errors? Yes. I've seen it a lot, especially in the pre-chaining days in Java. And still it's an easy thing to do today. Why? Because I don't know what to do with the checked exception, but I have to catch it. So I catch it. And um, if I'm extra lazy (and surely no one's lazy, right?), I just ignore it. Maybe I'm nice enough to log it or wrap it.

But manual error code checking is just begging to be ignored instead of likely to be ignored.

And if you ignore errors, instead of crashing the request (and cleaning up resources), you end up in an unstable, dangerous program state.

So, the fine folks behind Go (and yes, some of them obviously have many more years under their belts than me) say they don't need exceptions and can't make them work with serious parallelism anyway. Nice and well, but expect most programmers to create unstable, buggy code.

You can't expect programmers to write "a good error message now", emphasis on the now. If you do, expect instead that laziness will lead to extreme bugs.

Seriously, figure out how to crash a specific "request" with automatic cleanup when errors happen, or expect bugs.

Oh, and figuring out how to produce only one error report in the logs for each failure is also awesome. Those of you who've worked in complex Java software (checked exceptions everywhere) can imagine what I mean by this. Similar issues seem easy to come by in a land of extreme parallelism, if you don't draw that "request" circle well.
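To make the "crash the request, clean up, report once" idea concrete, here's a minimal C++ sketch. Everything in it ('ScopedResource', 'runRequest', the log vector) is hypothetical and only meant to show the shape: destructors handle cleanup during unwinding, and a single catch at the request boundary is the only place that reports.

```cpp
#include <exception>
#include <functional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource whose destructor records that cleanup happened.
class ScopedResource {
    std::vector<std::string>& log_;
    std::string name_;
public:
    ScopedResource(std::vector<std::string>& log, const std::string& name)
        : log_(log), name_(name) {}
    ~ScopedResource() { log_.push_back("released " + name_); }
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;
};

// The "request" boundary: the one place that catches and reports.
// Code inside the handler just throws and lets destructors clean up.
bool runRequest(const std::function<void()>& handler,
                std::vector<std::string>& log) {
    try {
        handler();
        return true;
    } catch (const std::exception& e) {
        // Exactly one report per failed request.
        log.push_back(std::string("request failed: ") + e.what());
        return false;
    }
}
```

A handler that creates a 'ScopedResource' and then throws leaves two entries in the log: the release (from stack unwinding) followed by the single failure report. The rest of the program keeps running and can take the next request.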

Tuesday, November 17, 2009

Fan language now Fantom

The Fan programming language (targeting the JVM, JavaScript, and the .NET CLR) has been officially renamed to Fantom to improve searchability.

New domain name: fantom.org

Happily, file name extensions and tool names remain unchanged.

It's a great language, and here's wishing them some good times ahead!

Monday, November 16, 2009

Replicating vs. Understanding

A recent dialog in an IEEE Autonomous Mental Development newsletter discusses the principles vs. black art of developmental robotics. I tend to think of it as a question:

Do you want to focus more on creating intelligent robots or on understanding the details of how intelligent beings actually operate? Both are meaningful questions, but principles of self-organization are not necessarily the same as the principles used in the end by a self-organized system. These topics sometimes get blurred, I think.

But principles perhaps can be found in relation to either question. So maybe I blurred the original topic myself.

Thursday, November 12, 2009

So like Google's Go is from Mars: A first glance

Well, I heard about the new language Go this afternoon, and it sucked a bit of time that I'm really short on. First note is that Go is from Mars. Take this sample from the tutorial:


for i := 0; i < flag.NArg(); i++ {
    if i > 0 {
        s += Space
    }
    s += flag.Arg(i);
}

Personally, I've never seen curly-based syntax without parens on fors and ifs. Just doesn't seem right to me. There's some other style in the language that's new to me. It looks so strange that I'd almost think it were an April Fools joke ... if it were April 1st, I guess.

But to get past the initial shock of alien appearance, here's my take at a first glance. This is based on their docs. I haven't actually tried it out.

The good:

Pure interface, simple struct, no-inheritance, straight functions OO. If you've read other posts on my blog, you can imagine I'd love this. I'm glad someone was brave enough to try it out.

Super fast compiler. C and C++ stink in this arena.

Fast execution, supposedly near native.

Lightweight threads and message passing, a la Erlang. Scalability should be sweet.

Normal goodies like closures and type inference.

Opinionated in some things: UTF-8 source, public/private by naming convention (caps for public).

Fairly simple namespacing.

BSD licensing.

It's from Google. Google has $$$$$$. They can push things forward.

The bad:

The syntax. It's just grating to me at this point.

Too much focus on ad hoc solutions rather than general principles.

No concept for nullable pointers vs. not.

No operator overloading.

No easy interfacing to C++.

Is it cross-platform (meaning at least Windows, Mac, and Linux) yet? I'm not sure.

The ugly:

Um, no exception handling. Seriously. So, um, expect to have to manually check for errors all the time. Which means, expect average programmers to ignore lots of errors and write buggy code.

Well, I'd complain about this if there were exceptions, but there's no mechanism for automatic resource cleanup (beyond garbage collection) that I could find. Less need without exceptions, but it still seems like a glaring hole to me. Also, if the closures and anonymous functions are good enough, maybe I'm also overstating this one.

Probably forgetting something.

Between the ugly and the fact that they aren't really production-ready yet, I think I'll just keep a watch on it for now. What I really want is a language that's IDE-friendly and human-friendly, compiles and runs like lightning (including fast starts and low memory usage), and instantly consumes C and C++ libraries. Oh, and doesn't make it painfully easy to ignore errors.

Wednesday, November 11, 2009

GPGPU in the browser

This winning entry in the Mozilla Labs Jetpack competition might get many of the details wrong (as examples, why part of Jetpack instead of as its own extension and how it uses Nvidia CUDA instead of a cross-vendor solution), but I'm still super excited that it has drawn attention to putting GPGPU in the browser. So, three cheers for Alexander Miltsev!

Tuesday, November 3, 2009

Inverse Programming Assignments

I was just wondering today if there was a good way to let students play the part of the computer instead of the programmer. That is, they would be given a program and be required to push and pop off the stack, change variable values, allocate objects on the heap, deal with arrays, handle conditionals and loops, choose which line to execute next, and so on.

It might be possible on paper, in board game format, or maybe more reasonable as a video game of sorts. Increasing levels of complexity (introducing new language features and machine concepts) might be nice.

Just wondering if playing the computer would help some folks with understanding the system better.

Liberally licensed embeddable web server for C or C++?

I've done some searching in the past but couldn't find a liberally licensed embeddable web server for C or C++. That surprised me. I'm so used to having multiple good options in Java (such as Jetty) or built-in web servers say in Fan or Ruby.

Two options I've looked at are libmicrohttpd (but it's LGPL which causes too much bother) and lighttpd (but it's not really designed to have an embedded option so far as I can tell, which rather surprises me). Anything else out there?

Tuesday, October 20, 2009

Member access syntax: dots, brackets, and functions

I've spent some time thinking about common alternative syntaxes for accessing object members. By members, I mean methods or fields or whatever. For example, we have the struct dot syntax:


person.age
array.length

And so on. C++ extended C to use the same syntax for method calls:


person.age() or person.getAge()
array.length()
pen.moveTo(x, y)

Languages like Eiffel, Ruby, or Fan that treat all method calls as abstract messages don't distinguish between field access and method calls.

Anyway, then we have array or hashtable access syntax, a different kind of membership:


persons[4]
properties["background"]

And languages like JavaScript unify the two; properties.background and properties["background"] mean the same thing. There's some nicety to that.

But I'm working through a series of issues that will eventually get tangled. First, I need to go back to C. I can abstract member access there, too:


age(person)
length(array)
moveTo(pen, x, y)

I'll pretend to have those namespaced for the moment (since otherwise in C they'd have uglier names), and if I have dynamic dispatch on the first (or more) arguments, this syntax could have the same meaning as virtual methods in C++ or Java or whatnot. For example, MATLAB can do OOP method dispatch with standard function call syntax.

So the syntax question becomes: do you like nested function calls or postfix member dot access? I think the postfix (or somewhat infix) C++ style is easier to understand:


toUpper(firstName(person)) vs.
person.firstName.toUpper

The chain is just easier to read. And apparently the C# folks thought so enough that they introduced extension methods. A semi-complicated way of allowing dot chain syntax. Why not just make a language such that both syntaxes are equivalent?
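For comparison, here's roughly what it takes to get both spellings in today's C++: you write the member and then a forwarding free function by hand. The 'Text' and 'Person' types here are made up for the example. A language with equivalent syntaxes would make the second half unnecessary.

```cpp
#include <cctype>
#include <string>

// Hypothetical string wrapper with a member-style toUpper.
struct Text {
    std::string value;
    Text toUpper() const {
        Text out = *this;
        for (char& c : out.value)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        return out;
    }
};

// Hypothetical person type with a member-style accessor.
struct Person {
    Text first;
    Text firstName() const { return first; }
};

// Free-function spellings that simply forward to the members.
inline Text firstName(const Person& p) { return p.firstName(); }
inline Text toUpper(const Text& t) { return t.toUpper(); }
```

Now 'toUpper(firstName(person))' and 'person.firstName().toUpper()' give the same result, at the price of writing everything twice.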

One reservation. Here's that tangle to which I was referring. Let's look at arrays again. Why do we really need a separate array access syntax vs. function call syntax? An array is just a function that hardcodes the responses, in one way of looking at it. So, like Scala or MATLAB, we could make them the same:


persons(4) vs.
persons[4]

And that frees up brackets for other uses (like generics in Scala). But, if like JavaScript, we consider struct membership and array-ish access to be the same, then 'persons[4]' is the same as 'persons.4', and that would be the same as '4(persons)', and all the worse if it's the same as 'persons(4)'.

So, it tells me that array/hashtable lookup should best be separate syntax from function calls. I like it ('persons[4]') as shorthand for notions like 'persons.get(4)'. I think otherwise the unification of concepts leads to entanglement.

At least, I haven't worked out another solution I like better.

Friday, October 9, 2009

Teaching Ideas: Quick Syllabus

I've been thinking about teaching ideas. I'm still rather sold on the weekly, multiple choice, open book quiz thing.

I think I've also decided that the syllabus should fit in one page. I'd love to get to real content even the first lecture. And the first quiz should include questions about the syllabus.

Unrelated question: Could it possibly be easier to introduce Python and then teach Java in one semester than just to jump directly into Java? Maybe for some people. Not sure I'd be that brave (or be able to get approval) to try it out, though.

I do hope to get the chance to teach soon.

Wednesday, October 7, 2009

Tamarin Nanojit

I sometimes get a hankerin' for programming languages. And I love script-style execution and high speed. I also love fast start-up and low memory consumption. Both of those points are making me more and more shy of the JVM.

So, how do you get your hands on a well-maintained cross-platform JIT? Maybe Nanojit (used by Tamarin and therefore Flash; also used by Firefox 3.5's TraceMonkey) is a good choice. Looks small and fast. Maybe even tasty. Something to keep in mind perhaps.

Tamarin also includes a garbage collector, as a related morsel. Here's hoping it's all thread-safe.

Again, just notes for the future.

Tuesday, October 6, 2009

Parameterized Subtasks

I've been studying logical planning and hierarchical reinforcement learning. It's common to see parameterized subtasks. For example, "grab the apple" vs. "grab the box". Those could be seen as function calls "grab(apple)" and "grab(box)".

Some of the logical systems manage to deal with parameterized tasks really as parameters, using some strategy for plugging in possible argument values. But not all. It's easier to look through the options if you just instantiate every possible case (or at least the relevant cases???). So instead of the general "grab(x)", you deal with "grab(apple)" and "grab(box)" as individual cases.

All of the reinforcement learning algorithms I've seen so far deal with it this way (instantiating each individual case). But it's hard to read every paper. Maybe I just haven't seen the right papers yet. Anyway, depending on the representation of the arguments, it seems like you lose a lot of ability to learn general concepts by learning each of the instantiations from scratch.
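As a toy picture of the "instantiate every case" approach, here's a small C++ sketch. The types 'TaskSchema' and 'GroundTask' and the function 'instantiate' are names I made up, not taken from any particular planner or RL system. The point is only that each grounded task becomes its own separate item, so anything learned for "grab(apple)" has no built-in connection to "grab(box)".

```cpp
#include <string>
#include <vector>

// A parameterized subtask such as grab(x), kept as a name plus an open slot.
struct TaskSchema { std::string name; };

// A grounded subtask such as grab(apple): the slot is filled in.
struct GroundTask {
    std::string name;
    std::string arg;
    std::string label() const { return name + "(" + arg + ")"; }
};

// Instantiate every schema with every candidate object.
// The result has schemas.size() * objects.size() independent entries.
std::vector<GroundTask> instantiate(const std::vector<TaskSchema>& schemas,
                                    const std::vector<std::string>& objects) {
    std::vector<GroundTask> out;
    for (const TaskSchema& s : schemas)
        for (const std::string& o : objects)
            out.push_back(GroundTask{s.name, o});
    return out;
}
```

Two schemas and two objects already give four unrelated cases, and the count multiplies from there.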

I also wonder about genetic programming. Do they generate parameterized functions there? Or do they usually just generate lower-level constructs?

Friday, October 2, 2009

LDS General Conference

The 179th Semiannual General Conference of the Church of Jesus Christ of Latter-day Saints is this Saturday and Sunday (October 3-4). It's a chance to receive guidance from living prophets, apostles, and others chosen by Christ. (For those of you who aren't familiar with Mormonism, yes, I meant what I said.) Anyway, you can watch most of it online.

Sadly not available for Linux (or is there some way to view or listen to Windows Media?). We've viewed byu.tv streams on Mac and Windows before. It's usually high quality video, even if the interface on the site is quirky.

Anyway, I highly recommend tuning in if you get the chance. Don't expect raw entertainment, though. It's a rather formal occasion.

Tuesday, September 29, 2009

Yet Another Cognitive Architecture (or Web Framework)

There's nothing new under the sun.

Back when I followed java.net better, it seemed like several times a week I'd see passing announcements of new Java web frameworks. This list (on java-source.net) shows at least a few of them.

Studying AI and machine learning again, seems like I'm getting some déjà vu. Everyone has another algorithm, or even full cognitive architecture (though that term itself is only used in certain circles). Apparently folks noticed this even years ago. I'd like to eat more pudding, personally.

Side note, sort of interesting to see Google suggestions for searches beginning with "yet another".

Friday, September 18, 2009

WebGL (3D graphics) in Firefox and WebKit

This news on WebGL is seriously cool. This O3D vs. WebGL concept is also interesting.

Taming C++

For certain reasons, I've been coding C++ recently. While I'd learned C++ before, this is the most consistently I've dug at it for a while.

Thing is, I don't like the std (STL) way of looking at things. I'm glad that C++ has a more standard library, and Boost tries to fit the same mold, but there's still a large diversity of style vs. the comparative uniformity in Java land or some other languages.

So, I'm going to the pain of adapting C++ to my style, wrapping various external libraries. (Trying to keep dependencies to a minimum, though.) Sort of a domain-specific view of the world, as is so common for C and C++ coding.

End result is that I've figured out how to make non-nullable opaque handle types with optional auto-disposal (simplified beyond auto_ptr) and whatnot. Some simple templates, too, but nothing crazy. Made a type similar to Scala's Option to get nullable when I want it.
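To give a flavor of what I mean, here's a simplified sketch along those lines. It is not my actual code, just an illustration with made-up names ('Handle', 'Option'), and it cuts corners: the handle copies its initial value, and the option requires a default-constructible type.

```cpp
#include <stdexcept>

// Hypothetical non-nullable owning handle: construction always yields a
// live object, and the destructor disposes of it automatically.
template <typename T>
class Handle {
    T* ptr_;
public:
    explicit Handle(const T& value) : ptr_(new T(value)) {}
    ~Handle() { delete ptr_; }
    Handle(const Handle&) = delete;             // no accidental sharing
    Handle& operator=(const Handle&) = delete;
    T& operator*() const { return *ptr_; }
    T* operator->() const { return ptr_; }
};

// Option-like wrapper for the cases where "no value" is really wanted.
template <typename T>
class Option {
    bool has_;
    T value_;  // this sketch requires T to be default-constructible
public:
    Option() : has_(false), value_() {}
    Option(const T& v) : has_(true), value_(v) {}
    bool isDefined() const { return has_; }
    const T& get() const {
        if (!has_) throw std::logic_error("Option is empty");
        return value_;
    }
    T getOrElse(const T& fallback) const { return has_ ? value_ : fallback; }
};
```

The idea is that plain 'Handle<T>' never needs a null check, and 'Option<T>' shows up in a signature only where absence is a real possibility.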

I also have the build incrementally making sublibraries to enforce dependency ordering.

And Eclipse CDT is autocompleting and so on across most of the system quite nicely. Debugger's not working in Eclipse, though.

The end result isn't perfect, but it's really not that painful at the moment. Lots of tasks not solved yet, though, like IO, character set conversion, networking, etc.

Monday, September 14, 2009

Particle-Based Physics

I'm interested in rather detailed (and still high-speed) world simulation. I still don't know all of what's out there to handle things for me. I have been learning Bullet Physics. What I'll want someday, and what I'm not sure Bullet provides, is fluid simulation. I want air and water.

Seems like the most straightforward way to do this is to simulate particles. Reminds me of smoothed particle hydrodynamics models that I worked with a bit at LANL many years ago. Seems like if you represent matter as particles, and those particles have properties to represent how they bond with each other based on proximity and whatnot, then you can get water and air resistance and so on.

If bonds can be more or less rigid, and if you prebond some particles, you can make rigid bodies, too. 8 particles with strong, rigid bonds and zero radius could make a box. 2 particles with positive radii could make a capsule. Strong yet flexible bonds make soft bodies, like what Bullet provides. But if the bonds are still breakable (and don't easily reform), seems like you could simulate awesomeness like breaking objects into pieces or otherwise damaging them.
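A very small C++ sketch of the bonded-particle idea, to make it concrete. This is a toy under my own assumptions, not Bullet's API or a real SPH model: bonds are plain springs in 2D, "rigid" just means a large stiffness, a bond breaks once stretched past a fixed ratio of its rest length, and particle radii and collisions are left out. All names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Particle { double x, y, vx, vy, mass; };

struct Bond {
    std::size_t a, b;     // indices of the bonded particles
    double restLength;    // separation at which the bond exerts no force
    double stiffness;     // spring constant; large values approximate rigid
    double breakStretch;  // bond breaks when length exceeds rest * this
    bool broken;
};

// One semi-implicit Euler step: spring forces along each intact bond,
// then update velocities and positions. Broken bonds never reform.
void step(std::vector<Particle>& ps, std::vector<Bond>& bonds, double dt) {
    std::vector<double> fx(ps.size(), 0.0), fy(ps.size(), 0.0);
    for (Bond& bond : bonds) {
        if (bond.broken) continue;
        const Particle& p = ps[bond.a];
        const Particle& q = ps[bond.b];
        double dx = q.x - p.x, dy = q.y - p.y;
        double dist = std::sqrt(dx * dx + dy * dy);
        if (dist > bond.restLength * bond.breakStretch) {
            bond.broken = true;  // damage: this bond is gone for good
            continue;
        }
        if (dist == 0.0) continue;  // direction undefined; skip this bond
        double f = bond.stiffness * (dist - bond.restLength);
        fx[bond.a] += f * dx / dist;  fy[bond.a] += f * dy / dist;
        fx[bond.b] -= f * dx / dist;  fy[bond.b] -= f * dy / dist;
    }
    for (std::size_t i = 0; i < ps.size(); ++i) {
        ps[i].vx += dt * fx[i] / ps[i].mass;
        ps[i].vy += dt * fy[i] / ps[i].mass;
        ps[i].x += dt * ps[i].vx;
        ps[i].y += dt * ps[i].vy;
    }
}
```

Very stiff bonds would need a small time step (or an implicit solver) to stay stable, which is part of the speed question.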

Sauerbraten's ragdoll physics system seems related, too.

Oh, and of course there's also a relation to finite element analysis.

The question is if you could optimize dynamic resolution and certain common cases such that the generalized system could still run at high speed. Seems like it could be super awesome if so.

Thursday, September 3, 2009

Uncharted 2 Cinema Mode

This peek at Uncharted 2 machinima support looks pretty good. Much nicer than Spore GA in many ways, but it doesn't seem to have built-in, high-level scripting support. Probably lower-level mods could cover in some ways.

I still think high-level machinima is much more the future of home animation than traditional tools like, say, Blender. I guess that's part of why the Blender Game Engine seems so important.

Got to make it easy. Simulate and automate. That's more approachable and, in the end (probably sooner than later), likelier to be more realistic than expecting people to do all that stuff manually.

Wednesday, September 2, 2009

Unit Tests in the Core Language for Improved Duckiness

I've thought some about how you could make a language that looks ducky (no types required) but still is semi-statically typed for speed, toolability, and so on. One option is the run-time tracing option used by Psyco, TraceMonkey, and so on. That is, see what types are needed as the program runs and JIT specialized code on the fly.

Another alternative, really making things more static, is to make unit tests an important part of compiling code. Sort of like how you need to give examples to C++ for template creation. It only knows what typed versions it needs to generate if you try to use those types.

For example:


def factorial(n, one) = {
  var result = one
  while (n > one) {
    result *= n
    n -= one
  }
  result
}

No types to be seen. Expected available operations ('>', '*', '-'), yes. Side note, that need to specify the "one" above is annoying, but I don't want to think of anything more clever at the moment.

Now, in my unit tests, I could say this:


assertThat(1) == factorial(1, 1)
assertThat(24) == factorial(4, 1)
assertThat(6.0) == factorial(3.0, 1.0)

If the main compiler/builder is aware of the unit tests (which I should have anyway, right?), it is aware of which types are available for the function/class/whatever in question. So, 'factorial' clearly needs both integer and floating-point (or whatever) implementations. I'm being vague on the language specifics here. Also, the main code base could also be used for inference. The unit tests just have a chance to go beyond that.

Such a system should even be able to determine which spots are co- or contravariant. Not sure how many examples would be needed. I'm not even sure if this is a good idea, but it's interesting to think about. Maybe some type placeholders, like templates in C++, would still be helpful.
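For reference, here's the C++ template version of the same factorial, which is the comparison I have in mind. The function is written once with a placeholder type, and the compiler only generates the 'int' and 'double' versions because some caller (such as a test) uses them.

```cpp
// Written once with a type placeholder instead of concrete types.
// Requires only that T supports '>', '*=', and '-='.
template <typename T>
T factorial(T n, T one) {
    T result = one;
    while (n > one) {
        result *= n;
        n -= one;
    }
    return result;
}
```

Calling 'factorial(4, 1)' instantiates the integer version and 'factorial(3.0, 1.0)' the floating-point one. The difference from the ducky idea is that here you still have to write the 'template <typename T>' placeholder yourself.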

This technique wouldn't require unit tests to be embedded in the main code. I personally tend to like the current common technique of parallel dirs. Main code here, tests over there.

Oh, and I also like screaming fast compilers. I'm afraid fanciness like this might slow things down just a tad ...

Tuesday, August 18, 2009

Learning New without Losing Old

I have lots of opinions on how to do certain things (programming, robot AI, ...). In order to really ingest and learn new ideas, I often have to let go of my preconceptions, at least temporarily. If I dive deep into new ideas, they can wash out my prior perspective. And some of that perspective probably has some good value.

So sometimes I try to correlate ideas. What's the big picture? How does what I know relate to what I'm learning? If I can figure out where they both fit on the big map, I can learn better and retain what's best from before.

It's just easier to either (1) hang on to old ideas without allowing new ones in or (2) just give up and focus on the new. It takes effort to assemble grand unification theories on the fly.

Monday, August 3, 2009

CMake rocks!

When I first ran across CMake being used, I thought "Oh no, another build system!" That means I need to install some lame software to build this thing.

After seeing it used some more and reading about it and thinking about alternatives, I've changed my tune.

If you need to build C or C++ (can't speak for other things), and you want to be cross platform (though I haven't tried cross-platform yet, it does seem promising), and you don't want too much pain, I currently recommend CMake. I haven't tried every alternative out there, which is why I don't give a stronger endorsement. But from what I've seen so far, CMake rocks.

And it definitely beats writing Makefiles (and I'm scared of autoconf, too, for some reason).

Friday, July 24, 2009

Spore Galactic Adventures for Machinima?

My kids earned money to buy Spore after enjoying the free Creature Creator trial. I had thought, "Wow, this could be great for machinima (making movies)!", but it wasn't really all there enough. And then my kids earned enough to buy Spore Galactic Adventures. After trying that out, I was really excited about the possibilities. But it's still not all there.

Here's an example of something I could do:

Talk about low res.

And they don't have recording of dialog. And it stops recording automatically every couple of minutes or so and some features of interaction are lacking and ...

You can't play the games themselves unless you own Spore Galactic Adventures either. So all I can really export is just the movies, which are lacking.

Could someone please make a serious, general-purpose, high-quality video machinima engine, please? I'm interested in it myself, and I might take my research directions into that someday (as well as robotics and AI which I focus on more at this point). Cause I can say that Spore GA is way higher level and easier to work with than Blender (even the Blender Game Engine). I personally don't mind lower quality for easier work. I'm not a full studio.

Spore GA might still be good for storyboarding, though, even if it's still lacking for movie production.

Monday, July 20, 2009

Fan vs. Scala: Nullability (including quizzes!)

Scala's great, but I personally prefer another new statically-typed language for the JVM, called Fan. I think some others might, too. This blog post is my next in a series of some reasons why I personally prefer Fan to Scala. Here was my next claimed advantage from my original post:

5. Fan types are not-nullable by default, with the concept built into the core of the language, instead of being "Option"al.

This particular subject is interesting. I've rarely seen ClassCastExceptions in practice, but I've seen plenty of NullPointerExceptions. Without good documentation (remember path of least resistance here), it's really hard to know whether a method expects to support null pointers or not. And in a method that has been around for months or years, it's hard to analyze whether all the callers (however far up the call chain) are passing in nulls or not. It's common to get null pointer exceptions in random places in your code all the time with enough developers and a large enough code base. And the solution to random breakage is ad hoc null checks and best-guess alternative behavior. That clutters up code and leaves expectations still unclear.

So, my answer is that nulls are usually evil. Apparently, I'm not alone in this opinion. Tony Hoare, for example, called null references his "billion dollar mistake" in ALGOL W.

My opinion of the right answer for Java is "don't use null if you can avoid it" and "document your methods as whether they support null" and "throw exceptions early if you get a null when you shouldn't". That takes a lot of manual effort in Java. And then there's the primitives not-nullable vs. objects nullable distinction, which is there mostly for convenience/performance. But it leaves the language more complicated than necessary.

So, both Fan and Scala seem to feel nulls are bad, too. They both discourage them. Fan avoids them by making every type not-nullable by default:

Str a := "hello"
Str? b := "world"
b = null // okay
a = b // NullErr run-time error
Str c := null // compile-time error

Lovely, lovely, in my opinion. The same type rules apply for method parameters. Yes, you should still document what your parameters mean, but your path of least resistance leaves you safe from nulls by default. And often, I think that's what coders mean, anyway. Usually, you get NullPointerExceptions because you just presume everything's not null, without even thinking about it. At least, that's how it often seems to me.

Also in Fan, the types 'Bool', 'Float' (64-bit), and 'Int' (64-bit) can be nullable or not. When not nullable, they are called "value types" because they are stored expanded (using 'boolean', 'double', and 'long' primitives in the JVM, for example). Fan also supports autoboxing along these lines. Happily, the '==' operator will compare values for these types (and is one of the overloadable operators for your own types, though I don't recommend overloading except for 'const' types in most cases). Anyway, this strategy also allows for faster math in many cases vs. purely reference-based languages. There are still some improvements needed relative to making the value types blend more transparently into the rest of the type system, but they are in the queue. The final goal is to make it blend almost seamlessly into the rest of the null-vs.-not-null type system.

Scala also recommends against null but in a very different way. They have very distinct reference vs. value types. All value types are predefined (I think) and mostly (but not entirely) correspond to the primitive types in Java. The reference types correspond to objects in Java. All reference types are nullable, but good style says don't use null. At least, when you don't need to interoperate with Java or whatnot. So, just keep in your head not to use null. Instead, you use 'Option' (usually, as there are other alternatives, too). My Fan examples above now look like this in Scala (if I'm not making any mistakes at the moment -- already fixed a few since my original post):

var a: String = "hello"
var b: Option[String] = Some("world")
b = None
a = b.get // NoSuchElementException run-time error
var c: String = None // compile-time error

I'm ignoring 'getOrElse' at the moment (just as I ignored the elvis '?:' operator in Fan above), by the way. I'm just focusing on the basic rules. My opinions on handling nulls when you have them are beyond my current scope.

So again, Fan has a (mostly) unified type system defaulting to not-null. Scala has a branched type system supporting value types on one side and nullable references on the other, but don't use null. I think Fan's system is simpler, and I like that it pushes not-null into the strong position.

Well, I could be done here, but I'm not.

There's another word in Java that gives the same idea as the English word "null". That word is "void". If you know Java, you know what both keywords mean, but I think the relationship is interesting, and it brings up additional exploration for Fan vs. Scala, too. The value "null" is actually out there. It's a real thing that represents "no object". On the other hand, "void" really means "nothing". It doesn't exist. You can't assign it to anything. As a type, it is an empty set. I should also mention 'Void' in Java as the reflection-friendly type corresponding to the keyword 'void'.

Scala quiz time! Instead of describing what they mean, I'll give a list of similar concepts in Scala and see if you can match them with the correct meaning. If you read above, you'll get some of these. And maybe you can guess the others. First the Scala list of terms (all related to null and void in Java):

1. Null
2. null
3. Nothing
4. Nil
5. None
6. Unit
7. ()

And here are the definitions to match them against (in a mixed order):

A. The empty list.
B. The empty set type, corresponding to the meaning of 'void' in Java. The subtype of all value and reference types in Scala. The type parameter used for the empty list.
C. The single instance of type 'Unit'. I think it's purposely not supposed to mean anything.
D. The single instance of type 'Null'. Corresponds to 'null' in Java, but don't use it that way.
E. A subtype of all reference types in Scala whose only value is 'null'.
F. The instance of type 'Option' that means the option was not chosen to be 'Some' value. Use with 'Option' types instead of 'null' for direct reference use. Also a kind of empty list.
G. A value type with only one (meaningless) member. Used in place of 'void', although it has a different meaning in my opinion.

Fan quiz time! I'm leaving out empty lists here, although you can have those in Fan just like in Java. They just don't have names or meanings so intertwined with 'null' and 'void', so just like Java, I think they are less relevant in Fan. Anyway, here are the terms I think should be considered:

1. null
2. Void
3. Void?

And here are the definitions to consider:

A. A type that corresponds quite closely to 'void' in Java. An empty set with no members, and only used for return types from functions.
B. A type with just the 'null' reference as a member. Somewhat related to the 'Unit' type in Scala, but I'm not sure it has any practical use. I think I'm glad it's there (just for consistency), and maybe it has some use I haven't figured out yet.
C. Corresponds to 'null' in Java. Assignable to vars of any nullable type.

Anyway, I hope this post was sufficiently readable. I think it's an important subject, and I personally like the solution that I think is simpler, more orthogonal, and less null-friendly. (And that particular personal preference is presumably easy to guess at this point.)

Wednesday, July 8, 2009

Prioritized Reduction Engine

I've been using MATLAB a lot in the past year. Many pros and cons with it, but it has definitely had me thinking from a batch update perspective. Saying 'a + b' can be really fast if 'a' and 'b' are both huge matrices. Despite some JIT support, MATLAB is generally quite slow if you want to write your own loops. While I still haven't done any explicit GPGPU programming, I have to imagine that the MATLAB batch/parallel operation mentality would map nicely to GPGPU thinking. I'm fairly excited about the whole OpenCL thing coming up. I like small computers rather than supercomputers, but I don't mind making more efficient use of them.

Anyway, while I've sort of gotten into the groove, I'm actually interested in a much fancier form of parallel computing. See my second bullet point here (on bubbling up). This has some relationships to the whole MapReduce concept, too, but I tend to imagine it fancier than that. The scheduler (or metathinker) becomes much more sophisticated, so it can handle the most important things first.

For example, I might want to add the elements in matrix 'a' and 'b', but maybe I care more about some of the results than others. And maybe I want to use those results in a later operation, say '(a + b) * c'. And I might have different tasks going on at the same time with dynamically adjustable priorities. And there might be different ways of getting to the same answer. A map might tell me where to go when I'm driving, but if I'm close enough, maybe I can look out and see where I need to go. Whatever data is available should also drive the later steps.

So we need a way to assess the value of single components of the operation and a way to execute the most important parts at all levels in parallel, such that the most important operations are getting done first. I haven't worked out all the details, but I know what I want it to look like. For example, I should be able to say '(a + b) * c' or much fancier programs and have it all just work prioritized and parallel in batch form. The priorities would have to be registered somehow, too, but perhaps separately from the algorithm chaining.
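To make the idea a bit more concrete, here is a minimal Scala sketch of the simplest version I can think of. Everything in it is an assumption for illustration: the name 'PrioritizedSketch', the per-element integer priority scores, the fixed work budget, and treating '(a + b) * c' as an elementwise operation. It is single-threaded and nowhere near a real engine; it only shows the "most important results first" ordering.

```scala
import scala.collection.mutable

// Hypothetical sketch: evaluate (a + b) * c element by element,
// most important elements first, stopping when the work budget runs out.
object PrioritizedSketch {
  // Returns index -> result for the `budget` highest-priority elements.
  def evaluate(
      a: Vector[Double],
      b: Vector[Double],
      c: Vector[Double],
      priority: Vector[Int], // assumed: one importance score per element
      budget: Int            // assumed: how many elements we can afford
  ): Map[Int, Double] = {
    // Max-heap of element indices, ordered by their priority score.
    val queue = mutable.PriorityQueue.empty[Int](Ordering.by((i: Int) => priority(i)))
    queue ++= a.indices

    val results = mutable.Map.empty[Int, Double]
    var remaining = budget
    while (remaining > 0 && queue.nonEmpty) {
      val i = queue.dequeue()           // most important pending element
      results(i) = (a(i) + b(i)) * c(i) // both steps of the chain for that element
      remaining -= 1
    }
    results.toMap
  }
}
```

Calling 'PrioritizedSketch.evaluate' with a budget smaller than the vector length returns only the highest-priority entries. A real engine would also need dynamic priorities, parallel workers, and reuse of intermediate results, none of which is attempted here.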

The ability to label intermediate computations for reuse at various steps could also be nice.

I think this concept isn't far off from the parallel and concurrency frameworks being built today. It's also not far off from the autoparallelization features in systems like MATLAB and Fortress, but none of these have the level of sophistication in dynamic scheduling that I want to see.

My current name for such a system is a "prioritized reduction engine". This is somewhere in between algorithms (iterative/online algorithms specifically) and parallel computing. Maybe I can find things in the literature about it. Sometimes it's hard to know the right words to use when searching.

Side thought I had while writing this: I wonder if there's ever any chance of getting GPGPU-powered math engines into web browsers. I guess I can dream.

Tuesday, July 7, 2009

Fan vs. Scala: Operators

Scala is a great, statically-typed language for the JVM with many advantages over Java. Some expect it to be "the next Java". It definitely has some momentum. However, I personally prefer another new statically-typed language for the JVM, called Fan. I think some others might, too. This blog post is my next in a series of some reasons why I personally prefer Fan to Scala.

Here was my next claimed advantage from my original post:

4. In Fan, you can't invent your own <**==!!! operator. (I haven't double-checked this particular one in Scala, but I've seen some doozies.)

Personally, I like operator overloading, because I'd rather say 'a * (b + c) - d' than 'a.multiply(b.add(c)).subtract(d)'. The former is more readable to me, and apparently someone felt that way in Java since they gave operators for basic numeric types in the first place. Well, it also helps to provide the separate worlds of primitives and objects that pervade Java.

However, in C++, people used operator overloading for the purposes of evil. That is, people sometimes have used say '+' to mean things very different from addition. That might have been encouraged by the C++ standard library immediately hoisting the bit shift operators (such as '>>') to provide IO services. Still, I don't see the evidence of widespread abuse even in C++, let alone other languages with operator overloading support. But I get the concern.

Scala is different from C++ in that you can invent your own operators. (I think Haskell might be the same and maybe other languages, too.) Then there's no need to abuse '+' when you can invent '+:+' instead (or whatever). For that matter, you can use normal methods as if they were operators. For example, instead of saying 'a.add(b)', you can say 'a add b'. My concern with Scala is the easy invention of creative operators. I have less concern about the use of named methods as operators.
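Here's a small sketch of both styles in Scala. The 'Vec' class and its methods are hypothetical, invented only for this example, and the '+:+' operator is deliberately given an arbitrary meaning to show how little the symbol tells you.

```scala
object OperatorSketch {
  // Hypothetical 2D vector type, made up for this example.
  case class Vec(x: Int, y: Int) {
    // A named method: readable and pronounceable.
    def add(other: Vec): Vec = Vec(x + other.x, y + other.y)
    // A conventional operator that simply delegates to the named method.
    def +(other: Vec): Vec = add(other)
    // An invented operator: adds with the other vector's fields swapped.
    // Nothing about the symbol hints at that.
    def +:+(other: Vec): Vec = Vec(x + other.y, y + other.x)
  }

  val a = Vec(1, 2)
  val b = Vec(3, 4)

  val viaDot = a.add(b)  // ordinary method call
  val viaInfix = a add b // same call in operator position (Scala 3 may warn unless 'add' is declared infix)
  val viaPlus = a + b    // same result through '+'
  val invented = a +:+ b // Vec(1 + 4, 2 + 3)
}
```

The first three values are all 'Vec(4, 6)'. The last one is 'Vec(5, 5)', which you could only know by reading the definition.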

My concern with invented operators is that I can't pronounce them, and I don't know what they mean. If I can read method names, I can guess what they mean. If I see a new operator, I have no idea. I also have trouble remembering lots of things, if I can't pronounce them. (Side note, IDEs may become vital in this matter for Scala.) I'll dodge the English-vs.-other-languages question at the moment, but feel free to weigh in on that, if you want.

Also, just like C++ immediately encouraged changing the semantics of operators (i.e., with ">>"), Scala has encouraged the use of fancy operators. Here's a list of operators extracted from the Scala standard library (based on this list):

^ ^^ ^^^ ^? ~ ~> ~! < <~ < <= == > >= >> >>> | || ||| - -= -> -- --= :: ::: :/: :\ ! != !! !? ? / /: /% * ** \ \\ & &~ && &&& &+ % + += +: ++ ++= ++: unary_~ unary_- unary_! unary_+

Off the top of my head, I can't guess at all what ':/:' should mean. Nor '^^^'. Nor some others. Combine this with implicit type conversion, and tracking down the source of an operator could also be tricky. So, I hope context usually makes things clear. And then there are IDEs to the rescue again. IDEs are a fact of life these days, though, and once Scala IDEs get good enough, that might mitigate this concern to a large degree. Hard to say for every use case. By the way, an operator ending in ':' in Scala is right-associative. Meaning that 'a :: b' means 'b.::(a)' rather than 'a.::(b)'. I know some of why they did that, but it's definitely another issue to be aware of.
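A quick Scala sketch of that right-associativity rule, using the standard list '::' plus a hypothetical 'Stack' class of my own with a colon-terminated operator:

```scala
object RightAssocSketch {
  // '::' ends in ':', so it binds to the right-hand operand.
  val viaOperator = 1 :: 2 :: Nil
  // The same thing written as explicit method calls on the right-hand side.
  val viaMethodCalls = Nil.::(2).::(1)

  // Hypothetical class, invented here, with its own colon-terminated operator.
  case class Stack(items: List[String]) {
    def +:(item: String): Stack = Stack(item :: items)
  }

  // Parsed as Stack(Nil).+:("bottom").+:("top"), so the method is looked up on Stack, not on String.
  val stack = "top" +: "bottom" +: Stack(Nil)
}
```

Both list values come out as 'List(1, 2)', and 'stack' holds 'List("top", "bottom")'.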

Here are some other discussions that show attempts at making up interesting operators in Scala. Might be some good and some bad in there, but overall it seems like excessive complexity to me. And in my opinion, hats off to Bill (in one of the links above) for choosing a clear name for his method.

Anyway, Fan lets you use operators (since like Scala and many other languages, it tries to avoid excessive primitive vs. object distinctions), but they are effectively shorthand for method names. This technique is also used in Groovy, MATLAB, Python, and presumably other languages. I like it because it reminds you what that operator should do. Here are the operators and rules for operator overloading in Fan. So, really, saying 'a * (b + c) - d' in Fan is exactly the same thing as saying 'a.mult(b.plus(c)).minus(d)'. Personally, I like that.

Monday, July 6, 2009

.NET core spec more openly licensed than Java?

For the legally minded folks, how does this news from Microsoft compare to the licensing trouble between Sun/Oracle and Apache?

Friday, July 3, 2009

Formalized Path of Least Resistance

Just a quick note to myself (or others). I've more than once talked about how programming languages should be convenient but that the path of least resistance should encourage good programming. For example, Java's checked exceptions cause the path of least resistance to end up hiding error conditions and details. I don't like Java's checked exceptions.

It just occurred to me (though it's likely been done before by others) that you could formalize the path of least resistance when programming by approximating the gradient descent across some kind of vastly multidimensional space. By observing conditions when programming (maybe only visible in the programs produced, perhaps across time for openly accessible version control), you might be able to define a statistical distribution of cost functions used by programmers when making decisions. Hard to factor in things like deadlines, interpersonal concerns, life circumstances, and so on, but maybe some vague approximate model could be made.

Summary, hopefully in English, is that some folks like formalisms. Otherwise, they don't believe you at all. It might be possible to make a formalism to study the usability of programming languages for writing good software. Not 100% sure, though. Too many assumptions might be needed.

Fan vs. Scala: Globals and Variables

This is the next installment in my series hoping to elucidate my own preferences for Fan after having been a fan of Scala. I realize that the same thing isn't for everyone. If Scala works for you, great, and feel free to comment and/or correct any of my mistakes. For today, I'm now moving on to global variables. Here was my claim from my original post:

3. Fan doesn't encourage or even support global variables. It has an almost Erlang level of attention to concurrency. (Hopefully, people don't abuse 'object' in Scala for global vars, but I fear it's an easy trap, at least for newbies.)

This was perhaps one of my most unfair comparisons, but I still think there are interesting points to be made here.

One of the first questions is, what globals? When I speak of globals, I speak of globally accessible objects. Being in a namespace doesn't make something less global, for my present concerns. Such globals include types, packages/pods, and static functions/methods and variables/constants. Anything not "injected" (passed in from outside) is a global, really.

Here's where Scala takes an interesting variation from most common languages. There are no "statics" in Scala. That's actually a nice simplification in ways. Instead of static members, you define singleton objects, like so (Scala here):

object OneAndOnly {
  val favoriteNumber = 3
  def favoriteDoubled() = 2 * favoriteNumber
}

Then you have an object accessible as 'OneAndOnly' (in whatever package), and you access its members sort of like you'd access statics in Java (or C++ or C# or Fan or such languages). See perhaps a more authoritative discussion in section 11 of this article. Singletons are also how you make applications in Scala:

object HelloWorld {
  def main(args: Array[String]): Unit = {
    println("Hello, world!")
  }
}

Here's my main complaint: I'm a big dependency injection fan, with or without a framework to do it for me. I think singletons are teh evil. I don't want them to be easier to do. I'd sort of like to see a language without static access to type names and such like, though at some point it gets in the way of convenience, and sneaky tricks behind the scenes (classloader tricks and bytecode manipulation in Java, perhaps) can still inject behavior. Lots of pros and cons floating around here. But in any case, I'll at least stick to not encouraging singletons. I'm sure you could make (constant) singleton objects in Fan the old-fashioned way, but I don't recommend it.
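To show what I mean by injection without a framework, here's a rough Scala sketch. All the names ('GlobalGreeter', 'Greeter', 'InjectedUser', and so on) are hypothetical and exist only for this comparison.

```scala
object InjectionSketch {
  // Singleton style: the user class reaches out to a global object.
  object GlobalGreeter {
    def greet(name: String): String = "Hello, " + name + "!"
  }
  class SingletonUser {
    // Hidden dependency: nothing in the constructor says this class needs a greeter.
    def welcome(): String = GlobalGreeter.greet("world")
  }

  // Injected style: the dependency is passed in from outside.
  trait Greeter {
    def greet(name: String): String
  }
  class PoliteGreeter extends Greeter {
    def greet(name: String): String = "Hello, " + name + "!"
  }
  class ShoutingGreeter extends Greeter {
    def greet(name: String): String = "HELLO, " + name.toUpperCase + "!"
  }
  class InjectedUser(greeter: Greeter) {
    def welcome(): String = greeter.greet("world")
  }
}
```

With the injected version, a test or another application can hand 'InjectedUser' a different 'Greeter' without touching the class. The singleton version can't be redirected that way without editing it.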

Here's hello world in Fan, by the way:

class HelloWorld {
  Void main() {
    echo("Hello, world!")
  }
}

I could have made 'main' static or allowed for args or whatnot, but Fan also allows simple modes like this where it instantiates a (non-singleton) HelloWorld object for you.

Anyway, where I was unfair in my original point for Fan vs. Scala was saying that Scala encourages global variables. It's not true. Scala encourages singletons, and you can have variables in those singletons, but every tutorial in Scala encourages the use of 'val' (think 'final' in Java) and 'def' (making methods) over 'var' (non-final vars or in other words, actual variables). While it's just as easy to say 'var', everyone will encourage you not to do so. Furthermore, they have lots of immutable types in the Scala standard library. That said, it might be easy to fall into the trap of using mutable vals or even just vars in your singletons. Especially if you aren't versed in the art of functional programming.
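A small Scala sketch of that trap, with a made-up 'Settings' singleton. It shows that a 'val' only fixes the reference, so a mutable collection behind a 'val' is still shared mutable state.

```scala
import scala.collection.mutable

object SingletonStateSketch {
  // Hypothetical singleton, invented for this example.
  object Settings {
    val favoriteNumber = 3                          // immutable value: safe to share
    val frozen = List("a", "b")                     // immutable reference to immutable data
    var visits = 0                                  // a true global variable
    val history = mutable.ArrayBuffer.empty[String] // a 'val', but the contents still change
  }

  // Any code anywhere can call this and change what everyone else sees.
  def record(page: String): Unit = {
    Settings.visits += 1
    Settings.history += page
  }
}
```

After two calls to 'record', both 'visits' and 'history' have changed for every user of 'Settings', even though only one of them was declared with 'var'.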

Fan, on the other hand, doesn't let you make global variables. And I'm talking your static members have to be 'const' not just 'final'. Further, closures are tracked dynamically for whether they reference non-const items, and you can't start a thread with access to mutable state from outside (Fan here, skipping into the middle of a method for convenience):

pool := ActorPool()
nums := [1, 2, 3]
a := Actor(pool) |Obj msg| {
  echo("$Time.now: $msg")
  nums[0]++ // <-- Error to reference non-const locals outside this block.
}

This nicely skirts the issues with 'final' or not for Java closures. (See more on Fan threading here, where I modified the above sample from.) And if I remember right in Fan (not double-checked right now), this is figured out at runtime. That is, some 'const classes' are known to have const instances at compile time. But some types (such as List and Map) might have const instances or not. So a runtime check can tell whether a closure is const or not, too.

It might be possible these days to pass around non-const messages between actors in Fan. I'm not sure, but the idea would be that only one actor should own a mutable object at a time. I don't recall the details. Someone who's more expert on this should feel free to chime in.

In any case, while Scala pushes immutability, it doesn't hold you to it the same way that Fan does. And sometimes code development just flows in the path of least resistance. That least resistance should keep your code clean, in my opinion, and I think Fan is stronger here.

That said, Fan best practices do not emphasize final locals the same way that Scala does. Least resistance in Fan has all kinds of non-final mutability in your local scope (Fan here):

evenCount := 0 // Hey, look! I'm not final!
[1, 2, 3].each {
  if (it % 2 == 0) {
    evenCount++ // Okay since Fan knows you are using this in the same thread.
  }
}

Personally, I like the mixture of mutable local state but const globals and cross-thread data. Apparently some other folks like the same style. See, for example, this discussion of the Reia programming language that allows non-final vars while compiling to Erlang's BEAM virtual machine. Good stuff, in my opinion, though I can't claim to be the super expert here. Just speaking from my own experience and understanding.

Tuesday, June 30, 2009

Firefox 3.5 and Ogg

Just wanted to say thanks to the Mozilla folks for having the guts to put Ogg in the browser. I know it won't have a deep impact overnight, but I think it might have deep long-term impacts.

Project idea for the willing and able: Make the video tag work cross browser. Swap via JS to Flash (or Silverlight or Quicktime or Java or ...) if video's not supported, automatically picking a different source. And make the paired sources easier to prepare on the server.

I guess the down side is that no one distributes video themselves, due to the large size, but maybe such a project would encourage the Vimeos, Hulus, and YouTubes of the world to consider support for open video. The goal in this case would be to make it easy for them to swap to Ogg (or whatever) video when Flash is unavailable.

Bonus points to Firefox if they ever get Canvas 3D in there. Even better to have an additional integrated mode for content inside and out of a 3D scene graph. (I suppose I shouldn't get my hopes too high for Bullet physics ...)

Thursday, June 25, 2009

Fan vs. Scala: Type System

It's been a while, but here's my next installment of detailed discussion of my personal reasons for preferring Fan to Scala. Here is my next claimed advantage:

2. In Fan, you don't have to figure out existential types or other complex typing. Fan will even do your casting for you in many cases.

There's a huge debate about dynamic vs. static typing. I like static typing (because of easier toolability, performance, and hints for the programmer but yes I know there are pros and cons). I don't like typing to get in my way. That means I only want to say as much as I want to say. That is, infer where possible, and sometimes I just don't care what's co or contravariant. (Oh, the blasphemy!) I also don't want the type system to tie me in knots. Sometimes static enforcement blocks me from doing something correct the way I want to do it, even if other times it catches my mistakes.

Both Fan and Scala (and C# and others) do limited type inference. Usually, you have to specify types for public methods and such. Sure, they could be inferred in some cases. Some languages do that, but I like it to be explicit. The exact limits of type inference differ among these languages, but most of the focus is on local vars. For example (in Fan):

greeting := "hello"
size := greeting.size

In Fan, ':=' is used for var initialization as opposed to the 'var' keyword in Scala or C#. You even use it when specifying an explicit type for the var, just for consistency. I like either style better than the "guess where the var is defined" behavior you get in Python or Ruby. (Actually, Python has fairly simple rules, but they can get annoying sometimes. I've heard Ruby is cleaning up their ambiguity in this arena, too.)

But, anyway, my discussion had more to do with this list of features from Scala.

1. Abstract Types
2. Existential Types (officially not recommended except for interoperability with Java, is my understanding)
3. Generics
4. Type Bounds
5. Variances
6. Views

And probably things I'm forgetting. Yes, the type system is serious in Scala. They want you to get it right. Personally, I see the need for awesome static typing in a language like OCaml that has no dynamic type safety (no casting exceptions) but still wants to be type safe and avoid security risks.

But in a language with dynamic type safety (like Java or ducky languages like Python, JavaScript, or Ruby), how often do you really see casting exceptions? If I make a mistake, it will be found (rather than opening security holes), and I don't usually make mistakes.

So, to make life easy, Fan will do your casting (up or down a hierarchy) automatically. For example (in Fan):

Obj[] items := ["something", "and", "more"]
Str message := items[0]

In Java, you'd have to make that second cast explicit, being a down cast. Welcome to "I don't need generics to have my casting done for me" land.
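For comparison, here's roughly what the same thing looks like in Scala, where the down cast also has to be spelled out. This is just a sketch with names of my own choosing, and I added a non-string element to show what happens when a cast is wrong.

```scala
object CastSketch {
  val items: Array[Any] = Array("something", "and", "more", 42)

  // val message: String = items(0)  // would not compile: Any is not a String
  val message: String = items(0).asInstanceOf[String] // explicit down cast

  // The pattern-matching alternative, which checks instead of assuming.
  val checked: Option[String] = items(3) match {
    case s: String => Some(s)
    case _         => None
  }

  // A wrong cast is caught at run time with a ClassCastException.
  def failingCast(): String = items(3).asInstanceOf[String]
}
```

The point is that a bad cast fails loudly at the cast site in either language, so having the compiler insert it for me doesn't give up any safety that the explicit version had.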

And while I know that Scala folks will be all over me about this feature of Fan, think first about Scala's automatic type conversion. Fan only casts for you (yay! in my book). Scala, for all its type safety will magically convert integers into strings, if you set up the right magic. I'm personally really scared of automatic variable swapping going on behind my back. It's also something I dislike about C#.
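Here's a minimal sketch of the kind of "right magic" I mean. The conversion 'intToString' is hypothetical and defined by me, not something from the Scala standard library.

```scala
import scala.language.implicitConversions

object ImplicitSketch {
  // Hypothetical conversion, defined only to show the mechanism.
  implicit def intToString(i: Int): String = "number " + i

  def shout(s: String): String = s.toUpperCase + "!"

  // Compiles even though 42 is an Int: the compiler quietly inserts intToString(42).
  val result = shout(42)
}
```

Nothing at the call site of 'shout(42)' hints that a different object is being built and passed along, which is exactly the part that scares me.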

One thing that I don't like about Fan is its lack of generics (except for Lists, Maps, and Funcs). Yep, I like generics (despite my list above on Scala). I just want autocasting and arbitrary support for co and contravariance. I'm a hippie that way. I think Scala's generics go too far. So really, I'd prefer some land in between Scala and Fan on the generics front. But between the two, I'll take Fan.

Side note, Fan also supports dynamic duck typing. If you say, "myVar->something", that does a dynamic lookup rather than the static form "myVar.something". Note that this is different from duck typing support in languages such as C#, Boo, and haXe, where method calls look the same, but the variable type is declared to be dynamic. The two styles have different effects on your code. Scala shuns dynamic typing in favor of static structural typing. I'm not sure either take is fully necessary, but between the two (duck typing vs. structural), I'd prefer structural. I just don't want the rest of Scala typing with it. Sorry for being too quick to find good links, but Google can probably help out.

I'll give at least one link, though, to the official description of Fan's system.

Monday, June 15, 2009

Fan vs. Scala: Target Platforms

This is a post in a series on some reasons why I prefer Fan to Scala. I don't hate Scala, and I don't mind if you use it. Still, I'm being biased here. And I don't speak authoritatively. I'm just some guy. The creators of Fan (Brian Frank and Andy Frank) are much smarter and nicer than I am. That's why they're wise enough not to do a series of blog posts like this.

So, back to my foolishness, in my original post, I claimed this as an advantage for Fan:

1. Fan has a core goal to support Java, .NET, and JavaScript platforms.

Scala has great support for Java, and it claims support for .NET, too. I've never used Scala nor Fan on .NET nor Mono, so I can't say which handles the platform better. So I'll just pretend they are equal on this subject (lacking time to gather evidence either way).

Still, I think JavaScript is just as important a platform as either Java or .NET. I don't see any reference to JavaScript on the recent poll at scala-lang.org. I have seen references to past efforts, and it seems David Pollak of Lift fame is working on some kind of Scala to JS compiler. (Another reference here.)

On the other hand, here's an example of upcoming Fan JavaScript support. And they list JavaScript as a core focus on their home page. I find that important.

Side note, obviously GWT is more mature than either of these efforts, and for serious cross-platform support (though without Java or .NET so far), see haXe.

Friday, June 12, 2009

Fan vs. Scala: My background

Well, my previous post on why I prefer Fan to Scala got some attention. It also got a few votes up and down at DZone. Also, I'm afraid I wasn't sufficiently clear in my post. I too rarely am. In any case, it seemed like it might be worth extending my discussion of the points I listed.

I ought to give some personal history first, though. I love Eclipse and what other such modern IDEs do for Java. I can navigate and fairly well tame code bases made of 1000s of files. I also love that Java performance is sometimes near C, except that I can just download and run jars. Getting stinking complex open-source C or C++ code to build correctly can really make me mad. (The download and run mentality of Windows or Mac really beats Linux in this respect, too, though maybe if Ubuntu becomes more ubiquitous, people can just target that and make life easier. Official packages are always so far behind for the few apps and libs that I really want fresh.) But back to Java, I really don't like it as a language. Checked exceptions, needing to repeat oneself, lack of closures, and so on really don't make my day. And I like to learn other languages, new and old.

I occasionally noticed Scala over the past few years. It seemed very promising. When I saw it starting to get attention a couple of years ago, I figured I'd learn it more and help be part of the momentum to see if it could overturn Java. Among other things, I submitted a game to the Java 4K game competition (jar size, not source size!) written in Scala. Scala doesn't shrink as nicely as Java, but I got it mostly working and in the right size. And I kept learning the language. I gave a brown bag presentation at my job on Scala about a year ago.

And just about the same day as my brown bag presentation, I saw posts from Cedric Beust and Stephen Colebourne about this other new language, Fan. I got to the site, looked around it, and I immediately switched my interests. I thought, hey, this is even closer to my own preferences, and it doesn't smell like you need a PhD to understand it (despite the fact that I'm now working on getting a PhD). I don't prefer every decision that had or has been made, but it's just so much closer to what I want than anything else out there with any momentum. It's statically typed with important limitations to structure programs, but it still has the feel of a scripting language. Scala might be convenient (usually), but it doesn't have that same relaxed air. And it doesn't even make some of the same guarantees you get from Fan.

So history aside, I'll try to delve deeper into my list of good things about Fan in the near future. And I'll keep the "vs. Scala" perspective, as well as "vs. Java" and maybe a few others. Nothing personal against Scala. If it's good for you, then good for you. I'm just afraid that for many folks looking at Scala, Fan is a better choice, but it's not high enough up the radar yet. Or the differences might not be clear enough. That's why I'm giving this focus.

Wednesday, June 10, 2009

Why choose Fan over Scala?

So, if Scala is the top contender for the "new Java", and it's a convenient yet statically-typed language, why bother to consider Fan? Both target the JVM, both have static typing, type inference, closures, and so on. Well, I personally think Fan is a better choice for me.

Here are some of my own top reasons to choose Fan:

1. Fan has a core goal to support Java, .NET, and JavaScript platforms.
2. In Fan, you don't have to figure out existential types or other complex typing. Fan will even do your casting for you in many cases.
3. Fan doesn't encourage or even support global variables. It has an almost Erlang level of attention to concurrency. (Hopefully, people don't abuse 'object' in Scala for global vars, but I fear it's an easy trap, at least for newbies.)
4. In Fan, you can't invent your own <**==!!! operator. (I haven't double-checked this particular one in Scala, but I've seen some doozies.)
5. Fan types are not-nullable by default, with the concept built into the core of the language, instead of being "Option"al.
6. The Fan core library is simple and straightforward. It has functional features but doesn't try to shove complexity down your throat.
7. In Fan, building and modifying nested objects is straightforward.
8. Fan package (or rather pod) namespacing is simple.
9. For Fan, I only have to follow one group on a single lovely site to keep up with the core language happenings.
10. In Fan, all the Ints and Floats are 64-bit. So you don't have to worry about choosing something smaller. And chars are just Ints (using 32-bit code points which fit easily and without worrying about sign bits in that 64-bit space). Breathe easy. Oh, and you do still have easy (Big)Decimal support, too, so don't worry about that.

I'm going to be biased and omit my reasons for not choosing Fan, except one. They haven't yet hit 1.0. (Don't let the version numbering scheme fool you.) There are still some backwards-incompatible changes to be made. But the goal is to stabilize things soon.

Tuesday, June 9, 2009

Java, .NET, and ECMAScript Regex Compatibility - Expert advice wanted!

I'm trying to figure out the compatibility of regexes for Java, .NET, and ECMAScript. I did a quick skim, but I don't really have the experience to know what practical implications exist. So I'm going fishing here.

At first glance, it seems that ECMAScript 3 regexes are a subset of those for Java or .NET. Does anyone happen to know if this is correct? Any specific practical gotchas encountered?

(Side note, it seems that Perl 5 sure left its mark on the world. It made a semi-standard for half-decent regexes in the world. And I call that a good thing. Even if Perl 6 changed their own regexes again. Side note 2, even if you don't know the answers to my questions, please forward to your friends who might be experts on this matter. Many thanks!)

Wednesday, June 3, 2009

Review of Mere Christianity

Mere Christianity by C.S. Lewis

My review

Rating: 4 of 5 stars

A really good book overall. Teaches lots of practical issues related to Christianity (as a religion and as a relationship with Christ) and Christian behavior. One sample of good advice is how giving ought to hurt. For instance, if our charitable donations don't hamper our personal desires some, then we probably aren't giving enough. On that word "charitable", there's a nice discussion of charity (Christian love) itself.

From a philosophical/logical perspective, I think sometimes Lewis claims more than he's proven. For example, I believe in right and wrong, and I believe him that it shows the existence of God. Also, that everyone, if they really think about it, can figure it out. But I don't think Lewis logically proved that there aren't alternative explanations. Still, the arguments are convincing if not watertight.

Going on a tangent, I feel there's some value in comparing Lewis's theology with that of Mormonism (my being a Mormon and all). There are several differences, but I think the most fundamental is the nature of humanity. In Mormonism, we believe that all people are begotten spirit children of God. Christ has a special status. He was/is perfect. He also had a special role to play, and we refer to him as God in that role. But we believe that the rest of mankind are also spirit children of God, not merely creatures.

However, in our fallen world, and given our fallen natures, many of the same principles apply as Lewis describes. That is, Christ's redemption brings us _back_ into the state of being God's children. The process of that redemption overlaps much with the nature of choice and grace that Lewis describes. Lewis was obviously very inspired in his doctrine, and I agree with a majority of his teachings here.

With the differences being subtle at that level, I've had to think some about what the practical effects of the difference might be. I'm not sure I have a full answer at this point, but it does create a different psychological effect. The world isn't "progressing" in the way Lewis describes. The fall itself was necessary, and the fallen world is part of the experience God wants for us. Also, Christians go back to the beginning of the world. The atonement works retroactively.

Anyway, in all it was a great book, and I'm glad I read it.


Monday, June 1, 2009

Exuberant Ctags

I've been spending more time in Vim than Eclipse recently, so I was missing the source navigation features of Eclipse. Well, I finally dug into making ctags work, so I have my basic navigation features back again (the "go to definition" kind).

Then I found out that GNU ctags doesn't seem to support local variables. Then I found Exuberant Ctags, which is working great for me (in C and C++, at least). For now, I use an alias defined like so:

alias tags='$HOME/.local/bin/ctags --c++-kinds=+l --c-kinds=+l'

I wonder if there's some way to tell Vim to update the tags for a project each time I save ...

Wednesday, May 27, 2009

Review of Beautiful Evidence

Beautiful Evidence by Edward R. Tufte

My review

Rating: 3 of 5 stars

First a comment that I read this book because of all the buzz on sparklines a few years ago.

As for my review itself: I liked the emphasis on the power of the human vision system to process large amounts of data quickly. The focus here, then, is on high information density with as much context as possible. Tufte really likes figures right next to related text, or even within the text. He likes scales on pictures, or perhaps well-known objects for context. Information to convey statistical significance is also considered important, as is the ability to show relationships. Summary: easily understood, easily available, honest, dense information is good.

I found the diatribes on PowerPoint and sculpture pedestals interesting. I did not think he presented convincing evidence against slide presentations. He could easily have handpicked so few example sources (even the dozens he had). I saw no claim against bias except a statement that they were "unbiased" selections. He chose some people claiming that slide presentations were responsible for the space shuttle Columbia disaster, including himself (if I remember correctly). Any claims of value for slide presentations were quickly dismissed by saying that important other folks found slide presentations bad.

I find it sad that he fights so hard against misrepresented information then proceeds to use diatribe, one-sided arguments, and psychological appeals with references to Soviet oppression as ways to state his case.

I think people want information summarized in many cases. Not everyone wants or should need to read a detailed report.

So maybe the better conclusion would be, "If you have a highly-visible and expensive risk of several people dying, maybe you should err on the side of caution and be willing to spend more time and money to make sure you are right." I think that's better than "PowerPoint kills people" (paraphrased by me).

I still do find it interesting to read the arguments for real tech reports, use of standard sentences and paragraphs, and so on. Also the complaint against "pitch culture". So, even though I disagree with the extremity of his position, I think there is a lot to learn here and think about.

Side note, it seems clear that he carefully laid out each page (or pair of facing pages) throughout the book with great attention to how the final physical product would look. In that sense, this book is definitely a work of art. I don't get the impression many technical books take presentation so seriously.


Wednesday, May 20, 2009

Jetpack: Firefox extension development escapes the 80s!

Read about Mozilla Jetpack here. Totally sweet. No restart required. No painful horrors in XUL, RDF, or project setup.

Finally.

Tuesday, May 19, 2009

Open Source Real-Time Raytracer

A recent paper out of Stanford from Saxena and Ng mentioned that they could train computer vision from raytracer output and use that knowledge effectively in real images. They couldn't do the same with raster (OpenGL) output. It wasn't the main focus of the paper, but I still found it very interesting.

And I like fast. And I don't like supercomputers.

Therefore, I'm rather interested in the idea of high-speed raytracing. That's why I liked this blog post on open and closed source real-time raytracers.

I'd like to see a serious real-time-or-faster, open source world simulator someday. Something that could gradually add new simulation features with time. Real-time raytracing seems to be a fundamental part of that (along with physics, audio, and so on).

Monday, May 18, 2009

The Answer's Already There

I've been thinking about robots learning how to act in the world around them. For any task, let's presume a program could be written to get the job done. How much effort to cover the task, including all the corner cases? Most solid software needs a lot of effort. The devil's in the details.

However, the details are all around us. Why use automated learning? If a strategy doesn't work, modify it. Automate the modification. This glosses over lots of the how question, and bootstrapping some answers into the system might speed things along. But why work out all the bugs for the system if the system can work out the bugs for itself?

I think the same issue can apply to many types of software, by the way.

The ability to sense the effects of actions is important in all this, too.

Thursday, May 7, 2009

Kindle DX for Textbooks?

So, I skimmed about Amazon's new Kindle DX. Larger than before. Supposedly good for textbooks and newspapers. Still gray scale.

I don't get it. Maybe a novel is fine in gray, but some things need color for full effect. Like picture books. Or textbooks.

Really. It's WAY easier to convey detailed information in color, and effective textbooks use that to their advantage. (My apologies in advance to those who can't see color.) Your product won't be effective for textbooks without color. That's my opinion.

Second, many students already carry laptops, often clunky ones. The Kindle might be sleeker, but expecting two devices (laptop and Kindle) seems a bother to me.

I just don't see this working. Give me a simple PDF or something (DRM'd or whatever). I'll get by. Really.

Monday, April 20, 2009

JFreeChart Item Rendering

So, it seems that the only way built into JFreeChart to render lines is one segment at a time, rather than making it all into one path.

And yes, this can make a difference when you, say, have dashed lines with many points along the way. Thick lines can also make certain end/join types cause trouble. (Sorry, too lazy to put up example pictures at the moment.)

Is there any way to make it draw a whole line at once?
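
To make the distinction concrete, here is a minimal sketch in plain Java2D of collecting all the points into one path and stroking it once. This is not JFreeChart's renderer API, and how to hook it into a JFreeChart renderer is not shown. The class and method names are made up for illustration. With a single path, the dash pattern runs along the whole line and the join style applies at each interior point, rather than the pattern restarting and end caps being drawn at every segment.

```java
import java.awt.BasicStroke;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.Path2D;
import java.awt.geom.PathIterator;

class WholeLinePath {
    // Collect every data point into one path: a moveTo, then lineTo for the rest.
    static Path2D.Double toPath(double[] xs, double[] ys) {
        Path2D.Double path = new Path2D.Double();
        for (int i = 0; i < xs.length; i++) {
            if (i == 0) {
                path.moveTo(xs[i], ys[i]);
            } else {
                path.lineTo(xs[i], ys[i]);
            }
        }
        return path;
    }

    // Draw the whole path with one dashed stroke, so dashes and joins are continuous.
    static void drawWhole(Graphics2D g2, Path2D path) {
        float[] dash = {6f, 4f};
        g2.setStroke(new BasicStroke(2f, BasicStroke.CAP_BUTT,
                BasicStroke.JOIN_ROUND, 10f, dash, 0f));
        g2.draw(path);
    }

    // Count path entries (moveTo and lineTo), mostly to inspect what was built.
    static int countSegments(Shape shape) {
        int count = 0;
        for (PathIterator it = shape.getPathIterator(null); !it.isDone(); it.next()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        Path2D path = toPath(new double[] {0, 1, 2, 3}, new double[] {0, 1, 0, 1});
        System.out.println(countSegments(path)); // 4: one moveTo plus three lineTo
    }
}
```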

Friday, April 17, 2009

GPGPU in our Eyes

Been reading Beautiful Evidence. With all the emphasis on information density, it makes me think about our minds as parallel processing engines. But you still need to get the data in there. Eyes are a really great way to input high-resolution 5-dimensional data (2 spatial, 3 color) for most people. Way better than reading words. Well, unless the picture can be conveyed in a few words.

This also reminds me of systems like numpy or MATLAB or such things where the language is slow but if you can get data into the low-level primitives, you can chunk things fast.

On a side note, I wonder how easy it is for people (especially those who are fully blind) to learn to make sense of bas relief data by touch.

Wednesday, April 8, 2009

Mono Gets Continuations

Looks like Mono is planning to one up Java (and Microsoft's .NET) again. It's continuations this time.

Miguel seems to be all about enabling technologies. Instead, Java seems to be about philosophy.

I'm still trying to dodge Mono for now, though.

Tuesday, April 7, 2009

OOP and Object Affordances

It occurred to me in the past week that there's a relationship between object-oriented programming and the psychological concept of real-world object affordances.

In our robotics research, we've mentioned Gibson's focus on how an affordance is not just about the object but about the agent (person) and the object. I can't grasp a basketball with one hand, for instance, but someone else can.

This reminds me of the topic of minimal interfaces in OOP and issues like mixins vs. extension methods. You can't possibly put everything you want about an object into its class (even with mixins). But how common are certain contexts (agents)? Only the most common things (if that can be determined) should be inside the class. That's my opinion. And the language should make other extensions as easily available as possible, now depending on your real context.

Multimethods also fit into this topic.

Quick, Regular, Open Book Quizzes

I've been thinking that weekly quizzes would help teach programming (or other subjects). They'd be online and multiple choice, focusing on concepts.

They'd be automatically graded, and open book (or internet, but maybe not open friend) would be fine, too.

The idea would be to "force" people to think regularly about the concepts. In order to answer the question, you'd have to think about it at least a little.

I say not open friend, because I don't want the help to be of the form ABBDCABADC. No thinking would be involved. Or maybe just make the answers in random order and unlabeled. Still maybe say no friends, but if they at least clicked the answers themselves, the answer-as-label might still require thinking.

Example question (and multiple answer could be nice, and I had a recommendation that partial-credit answers could also be cool):

Which of the following are expressions of type int?

5
(int)Math.random()
4.5
-4 + 1
"Hello"
"Hello".length()

Things like that. Maybe 5-10 questions per quiz.
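
One way to check the answer key for a question like this is to let the compiler pick an overload, since overload resolution works from the static type of the argument. A small sketch (the class and the typeOf helpers are made up for illustration):

```java
class QuizTypes {
    // The compiler chooses the overload from the static type of the argument.
    static String typeOf(int value) { return "int"; }
    static String typeOf(double value) { return "double"; }
    static String typeOf(String value) { return "String"; }

    public static void main(String[] args) {
        System.out.println(typeOf(5));                    // int
        System.out.println(typeOf((int) Math.random()));  // int (cast applies to the call result)
        System.out.println(typeOf(4.5));                  // double
        System.out.println(typeOf(-4 + 1));               // int
        System.out.println(typeOf("Hello"));              // String
        System.out.println(typeOf("Hello".length()));     // int
    }
}
```

So the int answers would be the first, second, fourth, and sixth options.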

Tuesday, March 31, 2009

Canvas 3D News

Oh, I'm so happy to hear updates on the 3D web front.

Wednesday, March 25, 2009

Java, Mono, Python, and SIMD

I've been interested in Mono's direct support for SIMD operations. As far as I can tell, it does make a difference in performance. Even things like NumPy (for Python) effectively use SIMD under the covers, if I understand correctly. But, apparently, we don't need support for SIMD in Java, because HotSpot will (at some magical point) make all the pain go away.

If you can tell, I'm disappointed in the Sun response to the problem. And how hard would it be to recognize when low-level 'for' loops are parallelizable with SIMD operations? My guess is that it would be very difficult to recognize automatically compared with the ease of writing code in parallel form to begin with.

I'm rather convinced that basic, and maybe some additional advanced, matrix operations ought to be in the core JRE. After some time in MATLAB, I'm convinced it's a better level of abstraction than 'for' loops everywhere, anyway. And I like pure Java. If it requires native code outside the JRE, it's uncool. I'm also tired of having no clear path for matrix support in Java.
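
To illustrate the level of abstraction I mean, here is a tiny pure-Java sketch. The Vec class is hypothetical and not part of any JRE. The point is only that callers write add(a, b) instead of a loop, which leaves the implementation free to use SIMD, native code, or a GPU later. I make no claim about what HotSpot actually does with these loops.

```java
class Vec {
    // Element-wise sum. Callers never see the loop, so the body could be
    // swapped for a faster implementation without touching calling code.
    static double[] add(double[] a, double[] b) {
        requireSameLength(a, b);
        double[] out = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    // Dot product, written the same way.
    static double dot(double[] a, double[] b) {
        requireSameLength(a, b);
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    private static void requireSameLength(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Length mismatch: " + a.length + " vs " + b.length);
        }
    }

    public static void main(String[] args) {
        double[] a = {1, 2};
        double[] b = {3, 4};
        System.out.println(java.util.Arrays.toString(add(a, b))); // [4.0, 6.0]
        System.out.println(dot(a, b));                            // 11.0
    }
}
```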

If scientific work (or other number crunching such as for games) takes people to MATLAB or Python or Mono or C, that leaves a huge hole for Java.

Maybe someday they'll care about this hole. Really, a matrix abstraction layer allows for easier, high level coding and easier optimization. Pretty please, Sun or IBM or whatever?

Bonus points for including GPGPU implementations and abstractions.

Wednesday, March 18, 2009

Web in Our Minds

Hmm. Been thinking a lot recently about how much of our symbol handling in our minds might not be too unlike the world wide web. Not exactly the same either, though. And there need to be ties between rich physical data and any "symbols".

Also thinking that our hypothesis representation (problem reduction?) space ought to be Turing complete. Not sure that this is related to symbol handling.

Wednesday, March 11, 2009

Teaching Expressions

I've found in recent days that many students can understand expressions better when built up one at a time. For example:

int i = 5;
if (Math.sqrt(i + 1) < 4.5) { /* ... */ }

So, the idea is to work outward with types:

What's the type of expression 'i'? int
What's the type of expression '1'? int
What's the type of expression 'i + 1'? int (then converted to double)
What's the type of expression 'Math.sqrt(i + 1)'? double
What's the type of expression '4.5'? double
What's the type of expression 'Math.sqrt(i + 1) < 4.5'? boolean (required for if)

Going over this in front of students and letting them give the answers seems an effective way to teach the concepts (for some people). Showing assignment, boolean operators, and other such examples can also be useful.
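
The same walk-through can be written as code by giving each subexpression its own explicitly typed variable, so the compiler confirms every answer. A minimal sketch (the class, method, and variable names are made up for illustration):

```java
class ExpressionTypes {
    // Each step of 'Math.sqrt(i + 1) < 4.5' gets a declared type.
    static boolean check(int i) {
        int one = 1;                    // '1' is an int
        int sum = i + one;              // 'i + 1' is an int
        double root = Math.sqrt(sum);   // sum is widened to double for sqrt
        double limit = 4.5;             // '4.5' is a double
        boolean result = root < limit;  // '<' yields a boolean
        return result;
    }

    public static void main(String[] args) {
        int i = 5;
        if (check(i)) {
            System.out.println("sqrt(i + 1) is below 4.5");
        }
    }
}
```

Changing any declared type to a wrong one (say, int root) gives a compile error, which can be a useful thing to show students as well.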

Wednesday, March 4, 2009

Build Objects looks interesting

I haven't tried it out yet, but Build Objects looks like the closest thing today to my Can Has Build idea that I would like much better than either Ant or Maven.

If you are looking for a better mousetrap, Build Objects is something you might want to consider.

Bullet for Physics, I think

The more I look at and think about different options, the more I'm convinced that the physics platform with legs is Bullet. And sadly, I don't think it's JBullet. That's too bad, because I definitely prefer Java. Maybe someday, native DLLs will just run as Java transparently, but until then ...

(Side note, I'm getting less convinced on using Blender as the Bullet wrapper for simulations, but maybe someday on that, too.)

Friday, February 27, 2009

Compiling GWT in GWT?

Um, just wondered today if it would be possible to use GWT's Java-to-JS compiler to compile the GWT compiler itself? Maybe the result would be too huge anyway, but if it worked you could easily push Java compiling into the browser itself. Could be great for Java tutorials, among other things.

Thursday, February 26, 2009

Hacking Up Statistics for Fun and Profit

The more I'm learning about inference and learning, the more I'm convinced that statistics is the "correct" way. My subject goes beyond statistics, too. For example, if you want to maximize something, you take the derivative and set it to 0 (and look at the second derivative to be sure you have a max and not a min or saddle point).
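
As a small worked instance of that recipe (the function here is just an example I picked):

```latex
\begin{align*}
f(x)   &= -(x - 3)^2 + 2 \\
f'(x)  &= -2(x - 3) = 0 \quad\Rightarrow\quad x = 3 \\
f''(x) &= -2 < 0 \quad\text{so } x = 3 \text{ is a maximum, with } f(3) = 2
\end{align*}
```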

In contrast to this, some machine learning is full of hacks that seemed like good ideas at the time: neural networks, decision trees, and so on. Many algorithms seem ad hoc. Now I'm being somewhat over-the-top in my assessment. Lots of those folks have been more formal than I ever really want to be. But it's also so removed from traditional techniques at times.

Well, my current conjecture is that if there is a "right" way to do something (and that's a strong statement in algorithms), you should still be able to see how well your "hack" approximates to it. And meanwhile, it might be fun. And, you might still do well enough but be faster or have other desirable properties absent from the right way.

Well, I'm speaking from ignorance anyway. Just some thoughts on my mind.

Friday, February 20, 2009

Can Has Build ... Again

Well, since javablogs.com was broken, and I'm not sure how to get them to push my entry through, and I actually want people to read this entry (looking to get others thinking about it if at all possible), here's another post just to link to my Can Has Build system description.

Summary, the idea is to beat out Maven for easy conventions and to beat Ant for reliability and low dependencies. As in, win each of those categories big time.

I could have missed it, but I haven't heard of any other Java build systems that work like this one.

So like, here's the link again. Please read and comment (and make it for me?) if you have the time. Thanks much.

Thursday, February 19, 2009

Can Has Build for Java

While not sleeping this morning, I thought through some ideas I'd had in the past on build systems. Ant is too manual, and coding in XML is a pain. Maven is too unreliable, and it sure requires a lot of configuration and magical incantations for a convention-based system. I want minimal dependencies, real convention-over-configuration, and a sane way to extend.

For the moment, I'm affectionately calling my (hypothetical) system "Can Has Build". Here's what a basic project directory structure looks like:

project-name/
    build.jar <-- the Can Has Build main jar (which is very small and executable)
    lib/ <-- jars needed for build and at runtime go here
    src/ <-- source folder for which the class files will go in your jar
    test/ <-- source folder for unit tests

After running the build (which by default does a clean build all, since your IDE is probably doing incremental compiling for play-as-you-go anyway; I think build systems are primarily for making a final product, so it's all right to be slow):

project-name/
    build.jar
    lib/
    out/ <-- not 100% sure what to call this ("test" and "target" both start with "t")
        project-name.jar <-- or maybe project-name-SNAPSHOT.jar
        test-results/
            ...
    src/
    test/

All you need for this to work is a JDK and "java -jar build.jar". Can Has figures out everything else on its own.
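
Since Can Has Build doesn't exist, here is only a rough sketch of what the core of that step might look like, using the standard javax.tools compiler API. The class name, the "out/classes" directory, and the helper methods are all my assumptions. Jar packaging, the "lib/" classpath, and running tests are left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class ConventionBuild {
    // Every .java file under <project>/src, found purely by convention.
    static List<Path> findSources(Path projectDir) throws IOException {
        try (Stream<Path> files = Files.walk(projectDir.resolve("src"))) {
            return files.filter(p -> p.toString().endsWith(".java"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // javac-style arguments: "-d <outDir>" followed by each source file.
    static List<String> compilerArgs(Path outDir, List<Path> sources) {
        List<String> args = new ArrayList<>();
        args.add("-d");
        args.add(outDir.toString());
        for (Path source : sources) {
            args.add(source.toString());
        }
        return args;
    }

    public static void main(String[] args) throws IOException {
        Path project = Paths.get("").toAbsolutePath();
        Path classes = project.resolve("out").resolve("classes"); // assumed layout
        Files.createDirectories(classes);

        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            // getSystemJavaCompiler() returns null when no compiler is available.
            throw new IllegalStateException("No compiler found; run with a JDK");
        }
        List<String> javacArgs = compilerArgs(classes, findSources(project));
        int status = compiler.run(null, null, null, javacArgs.toArray(new String[0]));
        System.out.println(status == 0 ? "compiled" : "compile failed");
    }
}
```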

Say you want to specify a version number? Then add a "project.xml" under your project dir:

project-name/
    project.xml
    ... all else as before ...

Why XML? Because I know XML better than properties files, and I think most other devs do, too, and it allows for hierarchical data if needed. To give an idea of what I think would go in here at first, here's an example:

<project
    name="Can Has Build"
    jar-name="build"
    version="0.0.0.0.0.0.1-SNAPSHOT"
    version-jar-name="false"
    main="canhas.build.Builder"
/>

I mean, is that so bad? I can imagine JRE version requirements and so on, too.
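
Reading a file like that takes very little code with the XML parser already in the JDK. A minimal sketch follows. The ProjectConfig class is hypothetical. Treating "version-jar-name" as "put the version in the jar file name" is my assumed meaning for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.SAXException;

class ProjectConfig {
    final String name;
    final String jarName;
    final String version;
    final boolean versionJarName;
    final String mainClass;

    // Missing attributes come back as empty strings from getAttribute().
    ProjectConfig(Element root) {
        name = root.getAttribute("name");
        jarName = root.getAttribute("jar-name");
        version = root.getAttribute("version");
        versionJarName = Boolean.parseBoolean(root.getAttribute("version-jar-name"));
        mainClass = root.getAttribute("main");
    }

    static ProjectConfig parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            return new ProjectConfig(root);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not read project.xml", e);
        }
    }

    // Assumed meaning: append the version only when version-jar-name is true.
    String outputJarName() {
        return versionJarName ? jarName + "-" + version + ".jar" : jarName + ".jar";
    }

    public static void main(String[] args) {
        String xml = "<project name=\"Can Has Build\" jar-name=\"build\""
                + " version=\"0.0.0.0.0.0.1-SNAPSHOT\" version-jar-name=\"false\""
                + " main=\"canhas.build.Builder\"/>";
        System.out.println(parse(xml).outputJarName()); // build.jar
    }
}
```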

So, say I want custom tools in my build such as Test Tool X or Parser Generator Z or Class Bundler/Renamer Q? Or my own custom script? I envision a "build" subdir:

project-name/
    build/
        lib/ <-- jars needed just at build time
        src/ <-- source code for custom build scripts
        tools/ <-- auto-registered tools that can filter the automatic build process
    ... all else as before ...

Or something like that. Again, this is convention-over-configuration. These custom tools should get picked up automatically by "java -jar build.jar".

I've also considered conventions for grouping related projects and easy specification of dependencies between them, but I'll forgo commenting on that right now. In any case, all 3rd-party libraries would need to be added manually. That's the reliability thing.

So, the summary, Can Has (which doesn't exist, so feel free to make it) has the following features:

* No dependence on someone else's server being up.
* No configuration needed for basic builds.
* No dependencies except a JDK (or just JRE if you bundle your own compiler under "build/").

Saturday, February 14, 2009

Finally Translucent Scrollbars

I'm so happy that Mozilla's Bespin has translucent scrollbars. Maybe been around before (other than in prototype ideas of my own in the past), but it sure hasn't been common, at least.

Maybe it will take a lot of use to see if it's a good idea for sure, but I'm still so happy that someone put it in a (sure to be) high profile project.

USB Microcontroller Programmers on Linux without Sudo

I was doing some microcontroller programming for the first time this week. Found out I had to 'sudo' to use 'avrdude' to write to the programmer on the USB port. Lame. After a couple hours of study learning about udev (and other worse alternatives), I put this in a file I named '/etc/udev/rules.d/85-avr.rules':

ATTRS{idVendor}=="03eb", ATTRS{idProduct}=="2104", MODE="0666"

Specifically, that's for the AVR ISP mk II (also called stk500v2).

My question is, why would I ever not want write permission to a USB device attached to my computer? Is that ever a security risk?

Wednesday, February 11, 2009

Don't use System.exit()!

Just a quick comment/plea about the evils of System.exit() (or whatever other equivalents in other languages if they are anything like Java). Really, 'exit' is like a global variable. You think you want to kill the program, but you really don't. The moment you use it anywhere, you've set up a trap that makes your software unusable by third parties. Consider these alternatives instead:

* Throw an exception.
* End your thread, but this is a deep subject in itself which is beyond my current scope.
* For real newbies (or others) working in main() alone, just use 'return'.

Really, please, and based on real-world pain, leave exit() alone unless that's really really really really really really really what you want. The status code might be one reason, but Java really isn't so often used for writing traditional processes. If you do need that, make sure you have your "traditional process" nicely distinct from the rest of your code.
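
Here's a minimal sketch of what I mean by keeping the "traditional process" part distinct. The class and the port-parsing task are made up for illustration. The reusable code throws, a run() method turns failures into a status code, and the single exit call lives in main() where third parties never have to go.

```java
class ExitAlternatives {
    // Reusable code: report failure with an exception, never System.exit().
    static int parsePort(String text) {
        int port;
        try {
            port = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("Port out of range: " + port);
        }
        return port;
    }

    // Still callable from other code: returns a status instead of exiting.
    static int run(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: ExitAlternatives <port>");
            return 2;
        }
        try {
            System.out.println("port " + parsePort(args[0]));
            return 0;
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
            return 1;
        }
    }

    // The thin process wrapper: the only place a status code becomes an exit.
    public static void main(String[] args) {
        int status = run(args);
        if (status != 0) {
            System.exit(status);
        }
    }
}
```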

Thursday, February 5, 2009

Reinforcement Learning Competition

Looks like some great problems (ranging from "Infinite Mario" to a helicopter simulator) to work on for this 2009 reinforcement learning competition. If you have experience or want experience programming AI, this could be a bunch of fun. I'm too busy with a mountain of other things, or else I'd probably look at that octopus problem. I just love high-dimensional action spaces.

Side note, looks like they use Java in their software, but not exclusively. I haven't dug deep enough to see everything going on.

Monday, January 26, 2009

Has Git actually won?

I think one of the things hampering the world moving on from svn has been the lack of a clear successor. Both hg and git (and bzr earlier on) were getting a lot of attention which means that neither was an obvious choice. But git has been getting the air time recently. I've toyed with both and have to admit a preference to git myself, too.

The git port to windows (msysGit) has also been good.

I guess if we ever see Google Code hosting adding git support, that would be a real sign. Curious to see. (For a few objective and also some personal reasons, I'd rather use Google than GitHub, but not everyone will have the same concerns I do.)

In any case, I also find using git for local history tracking to be a nice choice. Not quite so automatic as Eclipse local history, but it allows more definite control and beats some other options.

Does anyone else have an opinion? Has git beat hg for mindshare as the successor to svn?

Friday, January 23, 2009

Are killing and government control inescapable?

My second political post here. Apologies in advance, but do we have to trade war-mongering (and big government) for baby killing (and big government)?

Maybe I'm grossly misrepresenting everyone, but I'd love to find politicians to vote for that are honest, charitable, and competent. I know that means playing tough sometimes, but I think a lot of caution is in order for certain topics. Oh, and knowing how to balance a budget would be just sweet.

And, yes, I'm in the US. I can speak in an even less informed fashion about politics elsewhere.

Monday, January 19, 2009

Maybe Blender for Robot Simulations?

In the recent past, I've been eying JBullet and/or jMonkeyEngine and jME Physics for robot simulation. I think these could work, but I'm starting to like the idea of Blender instead. Blender is an opaque interface for newbies like me but still a lower barrier to entry for so many features than what I find trying to learn jME. At least for me right now. And I can do Python, too.

One concern I have with Blender is the GPL license, but that's not an immediate concern. I do love that I can download a binary easily on different platforms. And it might be interesting to make a network (or even web) server driven system with it, anyway. Easy way to allow for different languages, too.

Here's a nice detailed investigation into the use of Blender for robotics, by the way.

(Oh, and if you've heard of it, I might like Gazebo more if they distributed binaries and/or didn't try to make it so hard to build. Talk about having a billion dependencies on the latest version of everything.)

Monday, January 12, 2009

Use amsmath, not eqnarray

Being bothered in LaTeX by my difficulties with eqnarray, I searched a bit and found the recommendation to use amsmath instead, which generally already comes built in. I guess I'll give it a try.
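
For reference, a minimal sketch of what the amsmath version of a multi-line equation looks like, assuming a standard LaTeX install that ships the package (the equation itself is just a placeholder):

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% align lines up the equations at each & and numbers every line;
% align* does the same without numbers.
\begin{align}
  (a + b)^2 &= (a + b)(a + b) \\
            &= a^2 + 2ab + b^2
\end{align}
\end{document}
```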

LaTeX sure is a complex ecosystem. I like to imagine having better alternatives, but I haven't yet seen any. Well, someday, I may try to stick to pure web environments for most cases. Do I really care about ensuring sweet ligatures and so on? We read tons of web pages all the time. There've got to be tools for LaTeX-to-MathML (or images), but for actual conference and journal papers, I have a hard time imagining ever switching from LaTeX. Though, really, I'd love to see a conference or journal emphasize HTML publishing over PDF someday. I think it's possible.

Just that knowing HTML well and knowing LaTeX/PDF well seems like a lot of overhead. I'd like to limit the number of things I need expertise in. Maybe someday.

Wednesday, January 7, 2009

haXe to C++ Compiler

Um, compiling haXe to C++ (in addition to SWF, JS, and PHP) just might make it a killer platform. Maybe I should invest some time in it.

Tuesday, January 6, 2009

Java Open Sourced and Abandoned?

I'm tempted to read "instead of producing JDK7 we did JDK6u10 and JavaFX" as "Sun open sourced Java SE and then abandoned it to work on proprietary products (with some pure GPL side effects)". I think that's a bit too cruel, but it's also somewhat true.

The interesting thing is that few people have cared too much. I guess most Java is used in enterprise settings where stability is favored over bleeding edge. The Java community at large is very conservative when it comes to updates and changes. And we're used to just going along with Sun, too, I think (for core Java SE, at least). I think many other projects out there would have been forked by now if something like this happened.

Anyway, hopefully things get back on track some day (with nice licensing for the plug-in, JavaFX, and so on and/or progress on OpenJDK). I mean, Flash/Flex is about as open as JavaFX right now, if I understand things correctly. (And if you really want open, watch HTML 5, WebKit, Mozilla, and such.) Why not just go to the market leader?