Sunday, April 26, 2015

Binary to Human Readable

I spend a large chunk of my life reading files that record what was happening in the applications I work on. These so-called log files are like the computer equivalent of the diary of an obsessive person with no memory. Every once in a while, these logs contain data that doesn't correspond to readable characters, and sometimes people forget to take this into account when creating the log files. I'm working with one of those now, and I've had this issue before. After finally tiring of guessing from context what the funky symbols meant, I broke down and wrote this Ruby script.


# A script to convert non-printable binary characters in a file to a
# printable and human readable format.

if ARGV.length != 1
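  # Send the usage message to stderr so it never lands in a redirected
  # output file, and exit with a non-zero status to signal the error.
  warn "Usage: ruby convertbintohex.rb filename [> outputfile]"
  exit 1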
end

filename = ARGV[0]
# Binary mode ("rb") has to be in there because on Windows, text mode
# translates CRLF line endings and may treat byte 26 (Ctrl-Z) as end of file.
File.open(filename, "rb").each_byte do |c|
  case c
    when 0 then print "[NUL]"
    when 1 then print "[SOH]"
    when 2 then print "[STX]"
    when 3 then print "[ETX]"
    when 4 then print "[EOT]"
    when 5 then print "[ENQ]"
    when 6 then print "[ACK]"
    when 7 then print "[BEL]"
    when 8 then print "[BS]"
    # 9 (HT) and 10 (LF) are left as-is; they fall through to the else branch
    when 11 then print "[VT]"
    when 12 then print "[FF]"
    # 13 (CR) is left as-is; it falls through to the else branch
    when 14 then print "[SO]"
    when 15 then print "[SI]"
    when 16 then print "[DLE]"
    when 17 then print "[DC1]"
    when 18 then print "[DC2]"
    when 19 then print "[DC3]"
    when 20 then print "[DC4]"
    when 21 then print "[NAK]"
    when 22 then print "[SYN]"
    when 23 then print "[ETB]"
    when 24 then print "[CAN]"
    when 25 then print "[EM]"
    when 26 then print "[SUB]"
    when 27 then print "[ESC]"
    when 28 then print "[FS]"
    when 29 then print "[GS]"
    when 30 then print "[RS]"
    when 31 then print "[US]"
    when 127 then print "[DEL]"
    # Bytes 128..255 are outside ASCII, so they print as hex values
    when 128..255 then print "[#{c.to_s(16)}]"
    else print c.chr # printable ASCII characters, plus HT, LF, and CR
  end
end
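
For what it's worth, the same mapping could also be written as a lookup table wrapped in a method, which makes it easy to try on a string without creating a file first. This is only a sketch under my own assumptions: the names CONTROL_NAMES, PASS_THROUGH, and humanize_bytes are made up for illustration and are not part of the script above.

```ruby
# Sketch only: CONTROL_NAMES, PASS_THROUGH, and humanize_bytes are
# illustrative names, not part of the original script.

# Standard ASCII names for bytes 0..31, indexed by byte value.
CONTROL_NAMES = %w[
  NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
  DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US
].freeze

# Tab, line feed, and carriage return are left alone so the log keeps
# its line structure, matching the behavior of the script above.
PASS_THROUGH = [9, 10, 13].freeze

def humanize_bytes(data)
  data.each_byte.map do |b|
    if PASS_THROUGH.include?(b) || (32..126).cover?(b)
      b.chr                      # printable ASCII, HT, LF, CR
    elsif b < 32
      "[#{CONTROL_NAMES[b]}]"    # remaining control characters
    elsif b == 127
      "[DEL]"
    else
      format("[%02x]", b)        # bytes 128..255 as hex
    end
  end.join
end

puts humanize_bytes("OK\x01\x7F\xFF\n".b)
```

Returning a string instead of printing byte by byte is a design choice for this sketch: it makes the conversion easy to check in irb, at the cost of building the whole result in memory.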
