Wednesday, August 09, 2006

Ruby : Cryptography - Substitution Cypher

Here's a Ruby class that can perform an encryption based on a alphabetical subsititution cypher and decrypt the resulting cyphertext using the same key. The key is provided as any alphabetical phrase. Symbols and case in the phrase are ignored and the order of each unique letter in the phrase is used to construct a new alphabet. Any remaining letters not in the phrase are added to the end of the new alphabet in ascending order.



Once the key is set the encryption algorithm simply subsitutes the letters in the input string for the ones in the new alphabet. Decryption is the reverse operation. There are so many permutations of the 26 letter Roman alphabet that the cyphertext is hard to crack using brute force techniques. It is easier to crack using letter and word frequency analysis methods but that's beyond the scope of this blog.



Check out the link that inspired this work.




class SubstitutionCypher

attr_reader :key, :cypherbet

def initialize

# most common 2, 3 and 4 letter words could be used to
verify english
# plain text on a brute force attack

@@common_english_words =
["of","to","in","it","is","be","as","at","so","we","he","by","o
r","on","do","if","me","my","up","an","go","no","us","am","the"
,"and","for","are","but","not","you","all","any","can","had","h
er","was","one","our","out","day","get","has","him","his","how"
,"man","new","now","old","see","two","way","who","boy","did","i
ts","let","put","say","she","too","use","that","with","have","t
his","will","your","from","they","know","want","been","good","m
uch","some","time","very","when","come","here","just","like","l
ong","make","many","more","only","over","such","take","than","t
hem","well","were"];

@key = ""
@@alphabet =
"abcdefghijklmnopqrstuvwxyz".splitStripUniqDown
@cypherbet = @@alphabet
@cypherhash = makehash(@cypherbet, @@alphabet)
@plainhash = makehash(@@alphabet, @cypherbet)

end

def key=(key)

@key = key
@cypherbet = key.splitStripUniqDown
@cypherbet.delete_if { |c| not @@alphabet.include?(c)
}
@@alphabet.each { |c| @cypherbet.push(c) unless
@cypherbet.include?(c) }
@cypherhash = makehash(@cypherbet, @@alphabet)
@plainhash = makehash(@@alphabet, @cypherbet)

end

def decode(cyphertext)

substitute(cyphertext, @cypherbet, @cypherhash)

end

def encode(plaintext)

substitute(plaintext, @@alphabet, @plainhash)

end


def crack(cyphertext)

i = 0

@@alphabet.each_permutation do |alphabet|

@cypherbet = alphabet
@cypherhash = makehash(@cypherbet, @@alphabet)
@plainhash = makehash(@@alphabet, @cypherbet)
plaintext = decode(cyphertext)
check(plaintext)
i = i + 1
puts i.to_s+" : "+plaintext if i % 1000 == 0

end

i

end

def check(plaintext)

wordcount = 0

plaintext.each(" ") do |word|

if @@common_english_words.include?(word.strip)

wordcount = wordcount + 1

end

end

if wordcount > 1

puts "Found "+wordcount.to_s+" English Words in :
"+plaintext

end

end

private

def substitute(fromtext, frombet, tohash)

totext = ""

lfromtext = fromtext.downcase

lfromtext.unpack("a" * lfromtext.length).each do |c|

cc = c
cc = tohash[c] if frombet.include?(c)
totext = totext + cc

end

totext

end

def makehash(alphabet0, alphabet1)

hash = {}
i = 0
alphabet0.each do |c|
hash[c] = alphabet1[i]
i = i + 1
end
hash

end

end



This class uses the following String extension:




class String

def splitStripUniqDown
lkey = self.downcase
lkey.unpack("a" * lkey.length).uniq.delete_if { |c|
c==" " }
end

end



Here's a usage example




cypher = SubstitutionCypher.new

cypher.key = "Barry John Northern"
puts cypher.key
puts cypher.cypherbet.join

cypher.key = "The Quick Brown Fox Jumped Over The Lazy Dogs"
puts cypher.key
puts cypher.cypherbet.join

cypher.key = "attack"
puts cypher.key
puts cypher.cypherbet.join
s = cypher.encode("this is the plain text")
puts s
p = cypher.decode(s)
puts p
cypher.check(p)

#cypher.crack(s)



As you can see, the final call to "crack" is commented out. It will usually take a very long time to find something and then only if the rearranged letters are few and limited to the start of the alphabet.


No comments: