A text-to-text encoding to make all characters have the same number of occurrences

foobuzz/equitext

equitext

A Python module that encodes strings so that every character has the same number of occurrences in the encoded string. Encoding makes the string length grow by a factor of about 1.44.

>>> import equitext
>>> message = "A histogram is a graphical representation of the distribution of numerical data. It is an estimate of the probability distribution of a continuous variable (quantitative variable) and was first introduced by Karl Pearson."
>>> equitext.histogram(message)
  ======================================================================== 0.145
i ================================================= 0.1
a ================================================= 0.1
t ============================================= 0.09
r =============================== 0.063
o ============================= 0.059
e ============================= 0.059
n =========================== 0.054
s ======================== 0.05
b =============== 0.032
u =============== 0.032
d ============= 0.027
l ============= 0.027
f =========== 0.023
h ========= 0.018
c ========= 0.018
m ====== 0.014
v ====== 0.014
p ====== 0.014
y ==== 0.009
g ==== 0.009
. ==== 0.009
) == 0.005
A == 0.005
q == 0.005
w == 0.005
I == 0.005
P == 0.005
K == 0.005
( == 0.005
>>> encoded = equitext.encode(message)
>>> encoded
' ImobKAfgh)levscPwtp.(ydarnqui mrqolb(fevItu.npgiwyasAdch)KP sdc.pqrAnmihtaovfuKegw)lbIP(y huynbqo(iswlfIr.eP)gmtAapvKcd oKyahgsmwe)ntfuIip.PlrvqAd(bc (tawq.vAcplugI)mfsndyoirbPeKh( aeifnrPvtdhqwImgpylu.cb)AKos(.PwtsIboufmipKeAahv gn)yqldrc grPKIvil(dyA)s.hcpuqtbmaeofwn oKs(c)AgP.uhevIlqarmtpwnfydbi vwmhKqlrbf()ietopysPc.IudngAa'
>>> equitext.histogram(encoded)
. ======================================================================== 0.033
m ======================================================================== 0.033
) ======================================================================== 0.033
b ======================================================================== 0.033
r ======================================================================== 0.033
( ======================================================================== 0.033
  ======================================================================== 0.033
v ======================================================================== 0.033
h ======================================================================== 0.033
A ======================================================================== 0.033
c ======================================================================== 0.033
o ======================================================================== 0.033
t ======================================================================== 0.033
w ======================================================================== 0.033
n ======================================================================== 0.033
I ======================================================================== 0.033
y ======================================================================== 0.033
s ======================================================================== 0.033
u ======================================================================== 0.033
f ======================================================================== 0.033
P ======================================================================== 0.033
p ======================================================================== 0.033
g ======================================================================== 0.033
i ======================================================================== 0.033
K ======================================================================== 0.033
e ======================================================================== 0.033
d ======================================================================== 0.033
a ======================================================================== 0.033
q ======================================================================== 0.033
l ======================================================================== 0.033
>>> equitext.decode(encoded)
'A histogram is a graphical representation of the distribution of numerical data. It is an estimate of the probability distribution of a continuous variable (quantitative variable) and was first introduced by Karl Pearson.'
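
The property shown in the session above can be checked without the module itself. The following is a minimal sketch using only the standard library. The helper names `is_equidistributed` and `growth_factor` are hypothetical and not part of equitext, and the sample strings are made up for illustration rather than real equitext output.

```python
from collections import Counter


def is_equidistributed(text: str) -> bool:
    """Return True if every distinct character in text occurs equally often.

    Hypothetical helper for illustration; not part of the equitext API.
    """
    counts = Counter(text)
    # All counts are equal when the set of counts has at most one element.
    # An empty string trivially satisfies the property.
    return len(set(counts.values())) <= 1


def growth_factor(original: str, encoded: str) -> float:
    """Return the ratio of the encoded length to the original length."""
    if not original:
        raise ValueError("original must be non-empty")
    return len(encoded) / len(original)


# Made-up sample strings, not actual equitext output.
print(is_equidistributed("abcabc"))      # True: a, b and c each occur twice
print(is_equidistributed("aab"))         # False: a occurs twice, b once
print(growth_factor("abc", "abcabc"))    # 2.0
```

Assuming the package is installed, passing the result of `equitext.encode(message)` to `is_equidistributed` should return `True`, and `growth_factor(message, encoded)` gives the actual length ratio for that message.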
  • Installation: install the equitext package for Python 3 via pip3

  • The algorithm: on my blog

  • Module documentation: here
