Getting unsigned bytes in Java

Welcome back after a long period of overwork and vacation :) Let’s start small, it’s still almost summer.

Bytes in Java are always signed (range -128..127) and are quite horrible to use because of the casting and weird side effects (or I’m just stupid enough to call them weird).

But we need unsigned bytes quite a lot, mostly when reading various streams (network data, 8-bit strings, etc.): even with a text stream you still end up with a stream of bytes.

There is a nice trick for converting them from signed to unsigned when you want the normal 0..255 range for the chars.

byte mybyte = -104;
int mynewint = mybyte & 0xFF; // 152
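
To see why the mask works: when a byte takes part in `&`, Java first widens it to an int with sign extension, so -104 becomes 0xFFFFFF98. Masking with 0xFF clears the upper 24 bits and leaves 0x98, which is 152. A minimal sketch of this (the class and method names are made up for illustration):

```java
// Illustrative sketch; the class and method names are hypothetical.
class UnsignedByteMask {

    // Widen a signed byte to an int in the range 0..255.
    static int toUnsigned(byte b) {
        // b is promoted to int with sign extension, then the mask
        // clears the upper 24 bits and keeps the original 8 bits.
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte mybyte = -104;
        System.out.println(Integer.toHexString(mybyte)); // ffffff98
        System.out.println(toUnsigned(mybyte));          // 152
        System.out.println(toUnsigned((byte) 0xFE));     // 254
        System.out.println(toUnsigned((byte) 65));       // 65
    }
}
```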

Hope it helps!

I had quite a nice puzzle trying to find some easier way. Is there one?
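
One possible answer, assuming Java 8 or later: the standard library has `Byte.toUnsignedInt(byte)`, which performs the same masking under a readable name. The sketch below applies it to bytes read from a stream in bulk; the class name, the helper name and the sample data are invented for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Illustrative sketch; class name, method name and sample data are hypothetical.
class UnsignedStreamRead {

    // Read up to `count` bytes and return them as unsigned values (0..255).
    static int[] readUnsigned(InputStream in, int count) {
        try {
            byte[] buffer = new byte[count];
            int read = in.read(buffer);      // -1 signals end of stream
            int n = Math.max(read, 0);
            int[] result = new int[n];
            for (int i = 0; i < n; i++) {
                // Equivalent to buffer[i] & 0xFF
                result[i] = Byte.toUnsignedInt(buffer[i]);
            }
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0x98, (byte) 0xFE, 0x41 };
        int[] values = readUnsigned(new ByteArrayInputStream(data), data.length);
        for (int v : values) {
            System.out.println(v);           // 152, then 254, then 65
        }
    }
}
```

Note that `InputStream.read()` with no arguments already returns an int in the 0..255 range (or -1 at end of stream), so the conversion only matters for bulk reads into a `byte[]`.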

   

  • Ellie

    Nice entry!

    Though if you have, say:
    byte mySignedByte = 0xFE; // = 254 decimal

    You can also do (pseudo code) :
    int num = 16 * ((mySignedByte & 0xf0) >> 4) + mySignedByte & 0x0f

    And of course there is the hard method:
    int num = Byte.valueOf(mySignedByte).intValue();

    Sincerely,

    Ellie P.

  • http://ahtik.com Ahti

    thanks for the feedback!

    Not sure what the pseudocode is expected to do. For the 0xFE case (signed byte -2) the expression returns 14.
    Bit-shifting and multiplying are not needed to trim the sign by half-bytes.

    To fix it, (mySignedByte & 0xf0) + (mySignedByte & 0x0f) would work (equivalent to mySignedByte & 0xff).

    As for the hard method, it behaves as it should, preserving the sign, so it is no help:

    “byte mySignedByte = 0xFE;” is illegal code (cannot convert from int to byte) so one should either use
    “(byte) 0xFE;” or “Integer.valueOf(0xFE).byteValue();”.

    Both give a signed byte (-2 in this case).

    Next, Byte.valueOf(mySignedByte).intValue() gives back -2, a signed int.

    A byte is signed however we tweak the bits, and intValue() does, and should, respect the sign bit.

  • vb@vb

    You are absolutely right, Ahti – at least this is the most optimal way I could think of as well.

    Any other variations such as :
    int i = b<0 ? b+256 : b;
    are maybe simpler to read, but heavier in terms of performance.

    Now to Ellie's version. It returns 14, but only because of operator precedence.
    The actual "correct" formula Ellie meant (I think so at least) is:
    int num = 16 * ((mySignedByte & 0xf0) >> 4) + (mySignedByte & 0x0f);
    Otherwise + binds tighter than &, so the final & 0x0f is applied last, to the whole sum, and keeps only the low 4 bits.

    It is correct btw, but the question is why do it at all, since it is effectively just & 0xff.
    And the pattern 16 * (X >> 4), applied to a value whose low 4 bits are already zero, is a nice way to obfuscate the identity, kinda pseudo-security by obscurity :)

    Vadim.

  • http://ahtik.com Ahti

    Ahh yes, correct. I forgot about the possibility of getting the operator precedence wrong; I had just copied it into Java to check.
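
The expressions from the thread above can be checked with a short sketch. The class and method names are made up for illustration; the expressions themselves are the ones quoted in the comments, with 0xFE written as a cast so that it compiles.

```java
// Illustrative sketch; the class and method names are hypothetical.
class PrecedenceCheck {

    // Ellie's expression as posted: + binds tighter than &, so the
    // final & 0x0f is applied to the whole sum.
    static int asPosted(byte b) {
        return 16 * ((b & 0xf0) >> 4) + b & 0x0f;
    }

    // The same expression with the parentheses from Vadim's comment.
    static int parenthesized(byte b) {
        return 16 * ((b & 0xf0) >> 4) + (b & 0x0f);
    }

    // The ternary variation from Vadim's comment.
    static int ternary(byte b) {
        return b < 0 ? b + 256 : b;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xFE;                            // -2 as a signed byte
        System.out.println(asPosted(b));                 // 14
        System.out.println(parenthesized(b));            // 254
        System.out.println(ternary(b));                  // 254
        System.out.println(b & 0xFF);                    // 254
        System.out.println(Byte.valueOf(b).intValue());  // -2

        // The working forms agree with & 0xFF for every possible byte value.
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte x = (byte) i;
            if (parenthesized(x) != (x & 0xFF) || ternary(x) != (x & 0xFF)) {
                throw new AssertionError("mismatch at " + i);
            }
        }
    }
}
```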