Archive for October, 2007

Java ID3 Libraries – Too many poor choices

October 21st, 2007 jesse

We’re at a stage where we need to parse media files for their metadata.

Unfortunately, the Java Media Framework has not kept pace with native tools. There are few good options for most media container formats such as AVI, WMF, and MP4.

MP3 is the baseline format, so there should be a reasonably easy-to-use, well-documented, and stable Java library for it. In practice, the sheer number of libraries is overshadowed only by their poor documentation and how difficult they are to use.

Most MP3 libraries are designed to actually manipulate the file. Read-only functions are a small subset. Here are a few we tried and what we found:

  • javazoom.spi
    The API is awkward for ID3 alone. Library is primarily for decoding MP3 audio. Works well. Stable & maintained.
  • org.farng.mp3
    Javadoc still very sparse. Usable but requires extensive hunting.
  • JID3
    Thorough and very powerful.
    Difficult abstraction between versions V1 and V2.x tags.
  • MyID3
    Simple, easy, and reasonably well documented.

While JID3 seemed to be the most comprehensive and powerful, it had a steep learning curve and was difficult to use for simple tasks like reading and writing MP3 tags.

While less comprehensive, MyID3 was the clear winner for us because of its simplicity and documentation.

Categories: Java Development Tags:

Removing Control Characters (specifically ^M)

October 1st, 2007 jesse

Unwanted control characters often end up in a file when it is moved from DOS to UNIX. They typically show up as ^M (carriage return) or ^L (form feed) in the vi editor.

Here are two methods to remove them:

Command line:

The translate command “tr” with the -d flag is a convenient way to clean control characters from files using the command line.

First use “cat -v” to display the file with its non-printing characters made visible, then decide which of the control characters present you wish to remove.

Each control character is passed to tr as an octal escape. For example, ^C is represented by "\003", and ^M is represented by "\015".
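As a rough illustration of the inspection step, the sketch below builds a small file with DOS line endings and looks at it in two ways. The file name sample.txt is made up for this example, and od is shown only as an alternative view of the same bytes.

```shell
# Create a two-line sample file with DOS (CR+LF) line endings.
# "sample.txt" is a hypothetical name used only for this illustration.
printf 'first line\r\nsecond line\r\n' > sample.txt

# cat -v prints each carriage return as ^M at the end of the line.
cat -v sample.txt

# od -c shows the same bytes as escapes (\r for CR, \n for LF).
od -c sample.txt

# Count the lines that contain a carriage return.
cr=$(printf '\r')
grep -c "$cr" sample.txt
```

If grep reports 0, the file holds no carriage returns and there is nothing to strip.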

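Use the following command to remove ^M characters. Because tr reads only from standard input, the file is supplied by redirection and the cleaned text is written to a new file:

% tr -d "\015" < filename > newfilename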
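A short round trip, sketched here with made-up file names (dos.txt and unix.txt), shows the removal and one way to check the result. Note that tr takes its input from standard input rather than from a file argument.

```shell
# Build a sample file with DOS line endings (hypothetical file names).
printf 'alpha\r\nbeta\r\n' > dos.txt

# tr reads standard input only, so the file is supplied by redirection
# and the cleaned text is written to a second file.
tr -d '\015' < dos.txt > unix.txt

# Verify: no ^M should remain in the cleaned copy.
cat -v unix.txt

# The cleaned file is smaller by one byte per removed carriage return.
wc -c < dos.txt
wc -c < unix.txt
```

Writing to a second file rather than back onto the input matters here: redirecting output to the same file that is being read would truncate it before tr sees any data.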

Vi editor:

You may also use the vi regex substitution command to delete control characters.

To do this, use the normal search and replace syntax and enter each control character literally by preceding it with the [ctrl] + v key sequence.

For example, to replace the ^M character, type

:%s/^M//g

using the key sequence [ctrl] + v, [ctrl] + m to generate the ^M character.

As a cheat, you could also just remove the last character of each line using the following. This deletes whatever character ends each line, so use it only when every line ends in ^M:

:%s/.$//
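The last-character shortcut removes whatever character ends each line, so a line without a trailing ^M loses real text. Outside the editor, a more conservative variant can be sketched with sed. This swaps in sed for vi, and the file names mixed.txt and fixed.txt are assumptions for the example.

```shell
# Mixed sample: the first line ends in CR+LF, the second in LF only.
printf 'one\r\ntwo\n' > mixed.txt

# Hold a literal carriage return in a variable; this avoids relying on
# sed understanding the \r escape, which not every implementation does.
cr=$(printf '\r')

# Delete a carriage return only when it is the last character of a line.
sed "s/${cr}\$//" mixed.txt > fixed.txt

# Both lines should now print without a trailing ^M, and "two" is intact.
cat -v fixed.txt
```

Anchoring the match on the carriage return itself, rather than on any character, is what keeps the second line unchanged.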

References:

The ASCII table (http://www.asciitable.com) is a useful starting point for looking up the octal representation of a given character. The following is a brief list of control characters:

Oct  Dec  Hex   Name
000    0  0x00  NUL
001    1  0x01  SOH, Control-A
002    2  0x02  STX, Control-B
003    3  0x03  ETX, Control-C
004    4  0x04  EOT, Control-D
005    5  0x05  ENQ, Control-E
006    6  0x06  ACK, Control-F
007    7  0x07  BEL, Control-G
010    8  0x08  BS, backspace, Control-H
011    9  0x09  HT, tab, Control-I
012   10  0x0a  LF, line feed, newline, Control-J
013   11  0x0b  VT, Control-K
014   12  0x0c  FF, form feed, NP, Control-L
015   13  0x0d  CR, carriage return, Control-M
016   14  0x0e  SO, Control-N
017   15  0x0f  SI, Control-O
020   16  0x10  DLE, Control-P
021   17  0x11  DC1, XON, Control-Q
022   18  0x12  DC2, Control-R
023   19  0x13  DC3, XOFF, Control-S
024   20  0x14  DC4, Control-T
025   21  0x15  NAK, Control-U
026   22  0x16  SYN, Control-V
027   23  0x17  ETB, Control-W
030   24  0x18  CAN, Control-X
031   25  0x19  EM, Control-Y
032   26  0x1a  SUB, Control-Z

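When a lookup table is not at hand, the shell's printf can convert between the three notations. The helper name ctrl_codes below is hypothetical, not a standard utility; it is only a small sketch of the conversion.

```shell
# Print the octal, decimal, and hex forms of a character code.
# ctrl_codes is a hypothetical helper name, not a standard utility.
ctrl_codes() {
    printf '%03o %d 0x%02x\n' "$1" "$1" "$1"
}

ctrl_codes 13    # carriage return, Control-M: prints 015 13 0x0d
ctrl_codes 12    # form feed, Control-L:       prints 014 12 0x0c
ctrl_codes 015   # a leading 0 makes printf read the argument as octal
```

The octal value in the first column is the one to place after the backslash when passing a character to tr.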
Categories: Linux Misc Tags: