This was originally posted to Facebook by me at 2010-09-02 23:19. It was edited by me there 2010-07-03.
Java is an alright language. There are a lot of things it does right, but there
are a few things it doesn’t.
- Distinction between classes and packages. I should be able to create
sub-classes the same way as I add classes to a package; a package should
just be an empty class.
- Too many primitives. I should be able to (re-)construct more of the
language.
- No preprocessor/inlines. OO isn’t an excuse for this, make me do it at the
class level (or rather, source file, not supporting `#include’ is fine). I
should at least be able to add `#define int8 byte’ like in C. This
wouldn’t be as much of an issue if all these things weren’t primitives; I
could just do “public class int8 extends byte”. (yes, I could extend the
`Byte’ class, but it wouldn’t come with all the syntactic sugar primitives
get.)
- Numbers: names. Yes, the names used are long-standing convention in CS.
These include some of the worst short-sighted mistakes in all of
hackerdom… because they stuck. Yet most reasonable languages can still
support them alongside sane equivalents:
  - byte -> int8
  - short -> int16
  - int -> int32
  - long -> int64
  - float -> float32
  - double -> float64
This would easily be fixed if they weren’t all primitives (point 2), or if
I had a preprocessor (point 3).
- Numbers: unsigned. How about unsigned integers (uint16)? This would be
easy to implement, if everything weren’t a damn primitive.
- Give me an actual `struct’, like in C. I’m not asking for full manual
memory management, just the ability to organize a chunk of it; you can
still manage it for me. It would make serialization hellofalot
easier.
- It’s inconsistent about whether it uses the system encoding or its
internal encoding. The String object just became worthless to anyone
wanting to do any amount of I18N.
- Its internal encoding is junk. It maps UTF-16 code units onto the `char’
primitive, which is 16 bits.
  - UTF-16 is junk; use UTF-8.
  - With any UTF encoding you must allow for a variable bit-length: for
    UTF-16 it’s 16 or 32 bits per code point, for UTF-8 it’s 8 to 32 bits.
I understand how/why it arrived at the solution it uses; at the time Java
was designed, it was using UCS-2, which is a 16-bit encoding, and was
superseded by UTF-16 in 1996 with Unicode 2.0. However, this is one of
those things where you specify a new JVM version, and switch to UTF-8. You
can even leave a legacy mode in the JVM that still uses UCS-2.
- Octal prefix: `0’ is used as the prefix to specify an octal literal. Any
third-grader can tell you why using a 0 as a prefix to a number is a bad
idea; the number might just have padded zeros. Let’s look at the prefix
used for hexadecimal: `0x’. This is great:
  - It starts with a numeric character, which means that it must be a
    literal. If it started with an alphabetic character, it might be a
    variable name.
  - The second character is an alphabetic character that is not used in
    any number system that is used in computer science. This allows it
    to serve as a unique identifier.
Given these reasons, let’s think of a new prefix for octal… how about
`0o’. It took me literally less than 10 seconds to realize why `0’
sucked, and to think of a better one.*
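To make the padded-zero complaint concrete, here is a minimal sketch of what Java does with a leading `0’ in source code. The class and method names are made up for illustration; only the literal syntax and `Integer.parseInt’ are actual Java.

```java
// Hypothetical demo class; the names are illustrative only.
final class OctalPitfall {
    // A leading 0 makes this an octal literal, so it is eight, not ten.
    static int paddedTen() {
        return 010;
    }

    // The hex prefix is unambiguous: 0x10 is sixteen.
    static int hexTen() {
        return 0x10;
    }

    // Parsing text is a different story: parseInt defaults to base 10
    // and simply ignores the padding.
    static int parsedTen() {
        return Integer.parseInt("010");
    }

    public static void main(String[] args) {
        System.out.println(paddedTen());        // 8
        System.out.println(paddedTen() == 10);  // false
        System.out.println(hexTen());           // 16
        System.out.println(parsedTen());        // 10
    }
}
```

So the same three characters mean eight in source code and ten when parsed from a string, which is exactly the kind of surprise a distinct prefix like `0o’ would avoid.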

All in all, it’s still better than C++.
* Although, writing this gave me an even better idea (though it would break
`0x<value>’ for hex, which is incredibly common among many languages):
`<base-in-decimal>x<value>’
so octal would be `8x<value>’
and hex would be `16x<value>’.
It would be incredibly understandable and, depending on implementation, allow
simple arbitrary-base literals.
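Java has no such literal syntax, but the idea is easy to try out on strings. The following is only a rough sketch: `BasedLiteral’ and its `parse’ method are hypothetical names, and it leans on the standard `Integer.parseInt(String, int)’, so it is limited to bases 2 through 36.

```java
// Hypothetical helper for the "<base-in-decimal>x<value>" notation.
final class BasedLiteral {
    // Parses strings such as "8x17" or "16xff" into an int.
    static int parse(String literal) {
        // The base is written in decimal digits, so the first 'x'
        // is always the separator.
        int sep = literal.indexOf('x');
        if (sep <= 0 || sep == literal.length() - 1) {
            throw new NumberFormatException("expected <base>x<value>: " + literal);
        }
        int base = Integer.parseInt(literal.substring(0, sep));
        if (base < Character.MIN_RADIX || base > Character.MAX_RADIX) {
            throw new NumberFormatException("unsupported base: " + base);
        }
        return Integer.parseInt(literal.substring(sep + 1), base);
    }

    public static void main(String[] args) {
        System.out.println(parse("8x17"));    // 15
        System.out.println(parse("16xff"));   // 255
        System.out.println(parse("2x1010"));  // 10
    }
}
```

Note that under this scheme `0x10’ would be read as "base 0" and rejected, which is the breakage with existing hex literals mentioned above.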