+Copyright (c) 2011-2012 Tiejun Cheng
+Permission is hereby granted, free of charge, to any person
+obtaining a copy of this software and associated documentation
+files (the "Software"), to deal in the Software without
+restriction, including without limitation the rights to use,
+copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the
+Software is furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+FSelector: a Ruby package for feature selection and ranking
+**Git**: [https://github.com/need47/fselector](https://github.com/need47/fselector)
+**Author**: Tiejun Cheng
+**Email**: [need47@gmail.com](mailto:need47@gmail.com)
+**Copyright**: 2011-2012
+**License**: MIT License
+**Latest Version**: 0.1.0
+**Release Date**: March 1st 2012
+FSelector is an open-access Ruby package that aims to integrate as many
+feature selection/ranking algorithms as possible. It enables the
+user to perform feature selection by either a single algorithm or by an
+ensemble of algorithms. Below is a summary of FSelector's features.
+Feature List
+**1. available algorithms**
+    algorithm                       alias       feature type
+    -----------------------------------------------------------
+    Accuracy                        Acc         discrete
+    AccuracyBalanced                Acc2        discrete
+    BiNormalSeparation              BNS         discrete
+    ChiSquaredTest                  CHI         discrete
+    CorrelationCoefficient          CC          discrete
+    DocumentFrequency               DF          discrete
+    F1Measure                       F1          discrete
+    FishersExactTest                FET         discrete
+    GiniIndex                       GI          discrete
+    GMean                           GM          discrete
+    GSSCoefficient                  GSS         discrete
+    InformationGain                 IG          discrete
+    MatthewsCorrelationCoefficient  MCC, PHI    discrete
+    McNemarsTest                    MNT         discrete
+    OddsRatio                       OR          discrete
+    OddsRatioNumerator              ORN         discrete
+    PhiCoefficient                  Phi         discrete
+    Power                           Power       discrete
+    Precision                       Precision   discrete
+    ProbabilityRatio                PR          discrete
+    Random                          Random      discrete
+    Recall                          Recall      discrete
+    Relief_d                        Relief_d    discrete
+    ReliefF_d                       ReliefF_d   discrete
+    Sensitivity                     SN, Recall  discrete
+    Specificity                     SP          discrete
+    PMetric                         PM          continuous
+    Relief_c                        Relief_c    continuous
+    ReliefF_c                       ReliefF_c   continuous
+    TScore                          TS          continuous
+**2. feature selection approaches**
+ - by a single algorithm
+ - by multiple algorithms in a tandem manner
+ - by multiple algorithms in a consensus manner
+**3. available normalization and discretization algorithms for continuous features**
+    algorithm         note
+    --------------------------------------------------------------------
+    log               normalization by logarithmic transformation
+    min_max           normalization by scaling into [min, max]
+    zscore            normalization by converting into z-scores
+    equal_width       discretization by equal width among intervals
+    equal_frequency   discretization by equal frequency among intervals
+    ChiMerge          discretization by ChiMerge method
+**4. supported input/output file types**
+ - csv
+ - libsvm
+ - weka ARFF
+ - random (for test purpose)
+To install FSelector, use the following command:
+ $ gem install fselector
+**1. feature selection by a single algorithm**
+ require 'fselector'
+ # use InformationGain as a feature ranking algorithm
+ r1 = FSelector::InformationGain.new
+ # read from random data (or csv, libsvm, weka ARFF file)
+ # no. of samples: 100
+ # no. of classes: 2
+ # no. of features: 10
+ # no. of possible values for each feature: 3
+ # allow missing values: true
+ r1.data_from_random(100, 2, 10, 3, true)
+ # number of features before feature selection
+ puts "# features (before): "+ r1.get_features.size.to_s
+ # select the top-ranked features with scores >0.01
+ r1.select_data_by_score!('>0.01')
+ # number of features after feature selection
+ puts "# features (after): "+ r1.get_features.size.to_s
+ # you can also use multiple algorithms in a tandem manner
+ # e.g. use the ChiSquaredTest with Yates' continuity correction
+ # initialize from r1's data
+ r2 = FSelector::ChiSquaredTest.new(:yates, r1.get_data)
+ # number of features before feature selection
+ puts "# features (before): "+ r2.get_features.size.to_s
+ # select the top-ranked 3 features
+ r2.select_data_by_rank!('<=3')
+ # number of features after feature selection
+ puts "# features (after): "+ r2.get_features.size.to_s
+ # save data to standard output as a weka ARFF file (sparse format)
+ # with selected features only
+ r2.data_to_weka(:stdout, :sparse)
+**2. feature selection by an ensemble of algorithms**
+ require 'fselector'
+ # use both InformationGain and ChiSquaredTest
+ r1 = FSelector::InformationGain.new
+ r2 = FSelector::ChiSquaredTest.new
+ # ensemble ranker
+ re = FSelector::Ensemble.new(r1, r2)
+ # read random data
+ re.data_from_random(100, 2, 10, 3, true)
+ # number of features before feature selection
+ puts '# features before feature selection: ' + re.get_features.size.to_s
+ # based on the min feature rank among
+ # ensemble feature selection algorithms
+ re.ensemble_by_rank(re.method(:by_min))
+ # select the top-ranked 3 features
+ re.select_data_by_rank!('<=3')
+ # number of features after feature selection
+ puts '# features after feature selection: ' + re.get_features.size.to_s
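To picture the consensus step in the example above: each algorithm yields a rank per feature, the ensemble keeps the smallest (best) rank per feature, and the features are re-ranked on that combined value. The sketch below is plain Ruby and does not call FSelector. The helper `combine_ranks_by_min` and the sample rank tables are hypothetical, and breaking ties by feature name is an assumption of this sketch, not documented library behavior.

```ruby
# Hypothetical helper, not part of FSelector: combine per-algorithm ranks
# by taking the minimum (best) rank of each feature, then re-rank.
def combine_ranks_by_min(*rank_tables)
  features = rank_tables.flat_map(&:keys).uniq
  # best (smallest) rank each feature achieved in any table
  best = features.map { |f| [f, rank_tables.map { |t| t[f] }.compact.min] }
  # sort by best rank; ties are broken by feature name (an assumption)
  ordered = best.sort_by { |f, r| [r, f.to_s] }
  ordered.each_with_index.map { |(f, _r), i| [f, i + 1] }.to_h
end

ig_ranks  = { :f1 => 1, :f2 => 3, :f3 => 2 }
chi_ranks = { :f1 => 2, :f2 => 1, :f3 => 3 }
p combine_ranks_by_min(ig_ranks, chi_ranks)
```

Here `f1` and `f2` both reach rank 1 in one of the tables, so they take the first two positions and `f3` comes last.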
+ **3. normalization and discretization before feature selection**
+ In addition to the algorithms designed for continuous features, one
+ can apply those designed for discrete features after (optional
+ normalization and) discretization.
+ require 'fselector'
+ # for continuous feature
+ r1 = FSelector::BaseContinuous.new
+ # read the Iris data set (under the test/ directory)
+ r1.data_from_csv(File.expand_path(File.dirname(__FILE__))+'/iris.csv')
+ # normalization by log2 (optional)
+ # r1.normalize_log!(2)
+ # discretization by ChiMerge algorithm
+ # chi-squared value = 4.60 for a three-class problem at alpha=0.10
+ r1.discretize_chimerge!(4.60)
+ # apply ReliefF_d for discrete feature
+ # initialize with discretized data from r1
+ r2 = FSelector::ReliefF_d.new(r1.get_sample_size, 10, r1.get_data)
+ # print feature ranks
+ r2.print_feature_ranks
+FSelector © 2011-2012 by [Tiejun Cheng](mailto:need47@gmail.com).
+FSelector is licensed under the MIT license. Please see the {file:LICENSE} for
+more information.
+# make a ruby gem
+task :default => :gem
+task :gem do
+ Gem::Builder.new(eval(File.read('fselector.gemspec'))).build
+end
+# test example
+require 'rake'
+require 'rake/testtask.rb'
+task :test do
+ Rake::TestTask.new do |t|
+ t.libs = ['lib']
+ t.test_files = FileList['test/test_*.rb']
+ t.verbose = true
+ end
+end
\ No newline at end of file
Class: Array
+ — Documentation by YARD 0.7.5
+ Inherits:
+ Object
+ Defined in:
+ lib/fselector/util.rb
add functions to Array class
+ Instance Method Summary
Instance Method Details
+ - (Float ) ave
+ Also known as:
+ mean
+ # File 'lib/fselector/util.rb', line 14
+def ave
+ self . sum / self . size
+ - (Float ) sd
standard deviation
+ # File 'lib/fselector/util.rb', line 32
+def sd
+ Math . sqrt ( self . var )
+ - (Float ) sum
+ # File 'lib/fselector/util.rb', line 7
+def sum
+ self . inject ( 0.0 ) { | s , i | s + i }
+ - (Object ) to_scale (min = 0.0, max = 1.0)
scale to [min, max]
+ # File 'lib/fselector/util.rb', line 38
+def to_scale ( min = 0.0 , max = 1.0 )
+ if ( min >= max )
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " min must be smaller than max "
+ end
+ old_min = self . min
+ old_max = self . max
+ self . collect do | v |
+ if old_min == old_max
+ max
+ else
+ min + ( v - old_min ) * ( max - min ) / ( old_max - old_min )
+ end
+ end
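As a worked illustration of the scaling formula in `to_scale`, here is a stand-alone sketch that avoids reopening `Array`. The name `scale_to_range` is hypothetical and used only for this example. It raises an `ArgumentError` where the listing above calls `abort`, which is a deliberate difference.

```ruby
# Stand-alone min-max scaling; `scale_to_range` is a hypothetical name
# used only for this illustration.
def scale_to_range(values, min = 0.0, max = 1.0)
  raise ArgumentError, 'min must be smaller than max' if min >= max
  old_min, old_max = values.min, values.max
  values.map do |v|
    if old_min == old_max
      max # constant input: every value maps to the upper bound
    else
      min + (v - old_min) * (max - min) / (old_max - old_min)
    end
  end
end

p scale_to_range([2.0, 4.0, 6.0])             # → [0.0, 0.5, 1.0]
p scale_to_range([2.0, 4.0, 6.0], -1.0, 1.0)  # → [-1.0, 0.0, 1.0]
```

The smallest input always lands on `min` and the largest on `max`. Everything in between is interpolated linearly.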
+ - (Array <Symbol> ) to_sym
+ # File 'lib/fselector/util.rb', line 70
+def to_sym
+ self . collect { | x | x . to_sym }
+ - (Object ) to_zscore
+ # File 'lib/fselector/util.rb', line 60
+def to_zscore
+ ave = self . ave
+ sd = self . sd
+ return self . collect { | v | ( v - ave ) / sd }
+ - (Float ) var
+ # File 'lib/fselector/util.rb', line 22
+def var
+ u = self . ave
+ v2 = self . inject ( 0.0 ) { | v , i | v + ( i - u ) * ( i - u ) }
+ v2 / ( self . size - 1 )
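The statistics defined in this file (`ave`, `var` with an n - 1 denominator, `sd`, and `to_zscore`) can be checked by hand on a small input. The sketch below uses hypothetical stand-alone helpers (`mean`, `sample_variance`, `zscores`) instead of the `Array` extensions, so it runs without the library.

```ruby
# Hypothetical stand-alone helpers mirroring the Array methods above.
def mean(xs)
  xs.inject(0.0) { |s, x| s + x } / xs.size
end

# sample variance: squared deviations divided by (n - 1)
def sample_variance(xs)
  m = mean(xs)
  xs.inject(0.0) { |s, x| s + (x - m)**2 } / (xs.size - 1)
end

# z-score: distance from the mean in units of standard deviation
def zscores(xs)
  m  = mean(xs)
  sd = Math.sqrt(sample_variance(xs))
  xs.map { |x| (x - m) / sd }
end

data = [1, 2, 3, 4, 5]
puts mean(data)             # → 3.0
puts sample_variance(data)  # → 2.5
p zscores(data)
```

Note that a single-element or constant array gives a zero denominator somewhere in this chain, so callers may want to guard against those inputs before normalizing.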
\ No newline at end of file
Module: Discretilizer
+ Included in:
+ FSelector::BaseContinuous
+ Defined in:
+ lib/fselector/algo_continuous/discretizer.rb
discretize continuous features
+ Instance Method Summary
Instance Method Details
+ - (Object ) discretize_chimerge! (chisq)
data structure will be altered
discretize by ChiMerge algorithm
ref: ChiMerge: Discretization of Numeric Attributes
chi-squared values and associated p values can be looked up at
+degrees of freedom: one less than number of classes
chi-squared values vs p values
+degree_of_freedom   p<0.10   p<0.05   p<0.01   p<0.001
+        1            2.71     3.84     6.64     10.83
+        2            4.60     5.99     9.21     13.82
+        3            6.25     7.82    11.34     16.27
+ # File 'lib/fselector/algo_continuous/discretizer.rb', line 88
+def discretize_chimerge! ( chisq )
+ hzero = { }
+ each_class do | k |
+ hzero [ k ] = 0.0
+ end
+ f2bs = { }
+ each_feature do | f |
+ bs , cs , qs = [ ] , [ ] , [ ]
+ fvs = get_feature_values ( f ) . sort . uniq
+ fvs . each_with_index do | v , i |
+ if i + 1 < fvs . size
+ bs << ( v + fvs [ i + 1 ] ) / 2.0
+ cs << hzero . dup
+ qs << 0.0
+ end
+ end
+ bs << fvs . max + 1.0
+ cs << hzero . dup
+ each_sample do | k , s |
+ next if not s . has_key? f
+ bs . each_with_index do | b , i |
+ if s [ f ] < b
+ cs [ i ] [ k ] += 1.0
+ break
+ end
+ end
+ end
+ cs . each_with_index do | c , i |
+ if i + 1 < cs . size
+ qs [ i ] = calc_chisq ( c , cs [ i + 1 ] )
+ end
+ end
+ until qs . empty? or qs . min > chisq
+ qs . each_with_index do | q , i |
+ if q == qs . min
+ cm = { }
+ each_class do | k |
+ cm [ k ] = cs [ i ] [ k ] + cs [ i + 1 ] [ k ]
+ end
+ if i - 1 >= 0
+ qs [ i - 1 ] = calc_chisq ( cs [ i - 1 ] , cm )
+ end
+ if i + 1 < qs . size
+ qs [ i + 1 ] = calc_chisq ( cm , cs [ i + 2 ] )
+ end
+ bs = bs [ 0 ... i ] + bs [ i + 1 ... bs . size ]
+ cs = cs [ 0 ... i ] + [ cm ] + cs [ i + 2 ... cs . size ]
+ qs = qs [ 0 ... i ] + qs [ i + 1 ... qs . size ]
+ break
+ end
+ end
+ end
+ f2bs [ f ] = bs
+ end
+ each_sample do | k , s |
+ s . keys . each do | f |
+ s [ f ] = get_index ( s [ f ] , f2bs [ f ] )
+ end
+ end
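The listing above calls `calc_chisq` on the class counts of two adjacent intervals, but its body is not shown in this documentation. The sketch below is a minimal version of the usual ChiMerge statistic (observed versus expected counts over a 2 x k table). The name `chisq_adjacent` is hypothetical, and skipping cells whose expected count is zero is a simplification of this sketch, not necessarily what the library does.

```ruby
# Hypothetical stand-in for a chi-squared helper; c1 and c2 map each class
# label to the number of samples falling in two adjacent intervals.
def chisq_adjacent(c1, c2)
  classes = (c1.keys + c2.keys).uniq
  rows    = [c1, c2]
  row_tot = rows.map { |r| classes.inject(0.0) { |s, k| s + r.fetch(k, 0.0) } }
  col_tot = classes.map { |k| rows.inject(0.0) { |s, r| s + r.fetch(k, 0.0) } }
  n       = row_tot.inject(0.0) { |s, x| s + x }
  return 0.0 if n.zero?

  chisq = 0.0
  rows.each_with_index do |r, i|
    classes.each_with_index do |k, j|
      expected = row_tot[i] * col_tot[j] / n
      next if expected.zero? # simplification: skip empty cells
      chisq += (r.fetch(k, 0.0) - expected)**2 / expected
    end
  end
  chisq
end

# identical class distributions: good candidates for merging
puts chisq_adjacent({ :a => 5.0, :b => 5.0 }, { :a => 5.0, :b => 5.0 })   # → 0.0
# perfectly separated classes: should stay apart
puts chisq_adjacent({ :a => 10.0, :b => 0.0 }, { :a => 0.0, :b => 10.0 }) # → 20.0
```

With two classes (one degree of freedom) and a threshold of 2.71, the first pair would be merged and the second would not.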
+ - (Object ) discretize_equal_frequency! (n_interval)
data structure will be altered
discretize by equal-frequency intervals
+ # File 'lib/fselector/algo_continuous/discretizer.rb', line 42
+def discretize_equal_frequency! ( n_interval )
+ n_interval = 1 if n_interval < 1
+ f2bs = Hash . new { | h , k | h [ k ] = [ ] }
+ each_feature do | f |
+ fvs = get_feature_values ( f ) . sort
+ ns = ( fvs . size . to_f / n_interval ) . round
+ # avoid modulo by zero when n_interval is much larger than the sample count
+ ns = 1 if ns < 1
+ fvs . each_with_index do | v , i |
+ if ( i + 1 ) % ns == 0 and ( i + 1 ) < fvs . size
+ f2bs [ f ] << ( v + fvs [ i + 1 ] ) / 2.0
+ end
+ end
+ f2bs [ f ] << fvs . max + 1.0
+ end
+ each_sample do | k , s |
+ s . keys . each do | f |
+ s [ f ] = get_index ( s [ f ] , f2bs [ f ] )
+ end
+ end
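The boundary construction in `discretize_equal_frequency!` can be tried on a small array. In the sketch below, `equal_frequency_boundaries` follows the listing above, while `bin_index` is an assumed stand-in for the library's `get_index`, whose body is not shown here. Both names are hypothetical, and the bin size is clamped to at least 1 in this sketch.

```ruby
# Hypothetical helpers; boundaries follow the listing above, while
# `bin_index` is an assumed stand-in for get_index.
def equal_frequency_boundaries(values, n_interval)
  n_interval = 1 if n_interval < 1
  sorted  = values.sort
  per_bin = [(sorted.size.to_f / n_interval).round, 1].max
  bounds  = []
  sorted.each_with_index do |v, i|
    # place a cut halfway between the last value of one bin and the next value
    if (i + 1) % per_bin == 0 && (i + 1) < sorted.size
      bounds << (v + sorted[i + 1]) / 2.0
    end
  end
  bounds << sorted.max + 1.0 # upper bound above every value
end

# index of the first boundary that lies above the value
def bin_index(value, bounds)
  bounds.index { |b| value < b }
end

bounds = equal_frequency_boundaries([1, 2, 3, 4, 5, 6], 3)
p bounds                # → [2.5, 4.5, 7.0]
p bin_index(3, bounds)  # → 1
```

Six values split into three intervals give two values per bin, with cuts halfway between neighbouring values.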
+ - (Object ) discretize_equal_width! (n_interval)
data structure will be altered
discretize by equal-width intervals
+ # File 'lib/fselector/algo_continuous/discretizer.rb', line 10
+def discretize_equal_width! ( n_interval )
+ n_interval = 1 if n_interval < 1
+ f2min_max = { }
+ each_feature do | f |
+ fvs = get_feature_values ( f )
+ f2min_max [ f ] = [ fvs . min , fvs . max ]
+ end
+ each_sample do | k , s |
+ s . keys . each do | f |
+ min_v , max_v = f2min_max [ f ]
+ if min_v == max_v
+ wn = 0
+ else
+ wn = ( ( s [ f ] - min_v ) * n_interval . to_f / ( max_v - min_v ) ) . to_i
+ end
+ s [ f ] = ( wn < n_interval ) ? wn : n_interval - 1
+ end
+ end
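The bin computation in `discretize_equal_width!` maps each value to an interval index between 0 and `n_interval - 1`. The sketch below isolates that arithmetic for a single value. `equal_width_bin` is a hypothetical name, and the minimum and maximum are passed in instead of being looked up per feature.

```ruby
# Hypothetical stand-alone version of the bin computation shown above.
def equal_width_bin(value, min_v, max_v, n_interval)
  n_interval = 1 if n_interval < 1
  return 0 if min_v == max_v
  bin = ((value - min_v) * n_interval.to_f / (max_v - min_v)).to_i
  bin < n_interval ? bin : n_interval - 1 # the maximum falls in the last bin
end

p [0.0, 2.4, 5.0, 9.9, 10.0].map { |v| equal_width_bin(v, 0.0, 10.0, 5) }
# → [0, 1, 2, 4, 4]
```

The final clamp matters only for the maximum value, which would otherwise receive the out-of-range index `n_interval`.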
\ No newline at end of file
Module: FRank
+ Defined in:
+ lib/frank/base.rb,
+ lib/frank/ensemble.rb, lib/frank/algo_discrete/GMean.rb, lib/frank/algo_discrete/Power.rb, lib/frank/algo_discrete/Random.rb, lib/frank/algo_discrete/Recall.rb, lib/frank/algo_mixed/base_mixed.rb, lib/frank/algo_discrete/Accuracy.rb, lib/frank/algo_discrete/Relief_d.rb, lib/frank/algo_continuous/TScore.rb, lib/frank/algo_discrete/F1Measure.rb, lib/frank/algo_discrete/OddsRatio.rb, lib/frank/algo_continuous/PMetric.rb, lib/frank/algo_discrete/GiniIndex.rb, lib/frank/algo_discrete/Precision.rb, lib/frank/algo_discrete/ReliefF_d.rb, lib/frank/algo_continuous/Relief_c.rb, lib/frank/algo_continuous/ReliefF_c.rb, lib/frank/algo_discrete/Sensitivity.rb, lib/frank/algo_discrete/Specificity.rb, lib/frank/algo_discrete/McNemarsTest.rb, lib/frank/algo_discrete/base_discrete.rb, lib/frank/algo_discrete/ChiSquaredTest.rb, lib/frank/algo_discrete/PhiCoefficient.rb, lib/frank/algo_discrete/GSSCoefficient.rb, lib/frank/algo_discrete/InformationGain.rb, lib/frank/algo_discrete/AccuracyBalanced.rb, lib/frank/algo_discrete/ProbabilityRatio.rb, lib/frank/algo_discrete/FishersExactTest.rb, lib/frank/algo_discrete/DocumentFrequency.rb, lib/frank/algo_discrete/InformationGain_d.rb, lib/frank/algo_discrete/MutualInformation.rb, lib/frank/algo_continuous/base_continuous.rb, lib/frank/algo_discrete/OddsRatioNumerator.rb, lib/frank/algo_discrete/BiNormalSeparation.rb, lib/frank/algo_discrete/CorrelationCoefficient.rb, lib/frank/algo_discrete/MatthewsCorrelationCoefficient.rb
FRank: a ruby package for feature selection and ranking
Defined Under Namespace
+ Classes: Accuracy , AccuracyBalanced , Base , BaseContinuous , BaseDiscrete , BaseMixed , BiNormalSeparation , ChiSquaredTest , CorrelationCoefficient , DocumentFrequency , Ensemble , F1Measure , FishersExactTest , GMean , GSSCoefficient , GiniIndex , InformationGain , InformationGain_d , MatthewsCorrelationCoefficient , McNemarsTest , MutualInformation , OddsRatio , OddsRatioNumerator , PMetric , Power , Precision , ProbabilityRatio , Random , ReliefF_c , ReliefF_d , Relief_c , Relief_d , Sensitivity , Specificity , TScore
Constant Summary
+ GM =
shortcut so that you can use FRank::GM instead of FRank::GMean
+ GMean
+ Recall =
Recall, also known as Sensitivity.
+shortcut so that you can use FRank::Recall
+ Sensitivity
+ Acc =
shortcut so that you can use FRank::Acc instead of FRank::Accuracy
+ Accuracy
+ TS =
shortcut so that you can use FRank::TS instead of FRank::TScore
+ TScore
+ F1 =
shortcut so that you can use FRank::F1 instead of FRank::F1Measure
+ F1Measure
+ Odd =
shortcut so that you can use FRank::Odd instead of FRank::OddsRatio
+ OddsRatio
+ PM =
shortcut so that you can use FRank::PM instead of FRank::PMetric
+ PMetric
+ GI =
shortcut so that you can use FRank::GI instead of FRank::GiniIndex
+ GiniIndex
+ SN =
shortcut so that you can use FRank::SN instead of FRank::Sensitivity
+ Sensitivity
+ SP =
shortcut so that you can use FRank::SP instead of FRank::Specificity
+ Specificity
+ MNT =
shortcut so that you can use FRank::MNT instead of FRank::McNemarsTest
+ McNemarsTest
+ CHI =
shortcut so that you can use FRank::CHI instead of FRank::ChiSquaredTest
+ ChiSquaredTest
+ PHI =
Phi coefficient, also known as Matthews correlation coefficient.
+shortcut so that you can use FRank::PHI
+ MatthewsCorrelationCoefficient
+ GSS =
shortcut so that you can use FRank::GSS instead of FRank::GSSCoefficient
+ GSSCoefficient
+ IG =
shortcut so that you can use FRank::IG instead of FRank::InformationGain
+ InformationGain
+ Acc2 =
shortcut so that you can use FRank::Acc2 instead of FRank::AccuracyBalanced
+ AccuracyBalanced
+ PR =
shortcut so that you can use FRank::PR instead of FRank::ProbabilityRatio
+ ProbabilityRatio
+ FET =
shortcut so that you can use FRank::FET instead of FRank::FishersExactTest
+ FishersExactTest
+ DF =
shortcut so that you can use FRank::DF instead of FRank::DocumentFrequency
+ DocumentFrequency
+ IG_d =
shortcut so that you can use FRank::IG_d instead of FRank::InformationGain_d
+ InformationGain_d
+ MI =
shortcut so that you can use FRank::MI instead of FRank::MutualInformation
+ MutualInformation
+ OddN =
shortcut so that you can use FRank::OddN instead of FRank::OddsRatioNumerator
+ OddsRatioNumerator
+ BNS =
shortcut so that you can use FRank::BNS instead of FRank::BiNormalSeparation
+ BiNormalSeparation
+ CC =
shortcut so that you can use FRank::CC instead of FRank::CorrelationCoefficient
+ CorrelationCoefficient
+ MCC =
shortcut so that you can use FRank::MCC instead of FRank::MatthewsCorrelationCoefficient
+ MatthewsCorrelationCoefficient
\ No newline at end of file
Class: FRank::Accuracy
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/Accuracy.rb
Accuracy (Acc)
           tp+tn          A+D
+Acc = ------------- = ---------
+       tp+fn+tn+fp     A+B+C+D
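A quick numeric check of the formula, assuming A, B, C, D are the cells of the 2 x 2 contingency table documented for `BaseDiscrete` (A: feature present in class c, D: feature absent outside class c). The helper name `accuracy` is hypothetical, and returning 0.0 for an empty table is a choice made for this sketch.

```ruby
# A, B, C, D follow the 2 x 2 contingency table convention used in this
# documentation; `accuracy` is a hypothetical helper name.
def accuracy(a, b, c, d)
  n = a + b + c + d
  return 0.0 if n.zero? # empty table: defined as 0.0 in this sketch
  (a + d).to_f / n
end

puts accuracy(40, 10, 5, 45) # → 0.85
```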
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::AccuracyBalanced
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/AccuracyBalanced.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::Base
+ Inherits:
+ Object
+ Object
+ FRank::Base
+ Includes:
+ FileIO
+ Defined in:
+ lib/frank/base.rb
base ranking algorithm
+ Instance Method Summary
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Base ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/frank/base.rb', line 13
+def initialize ( data = nil )
+ @data = data
+ @opts = { }
+end
Instance Method Details
+ - (Object ) each_class
iterator for each class
e . g .
+self . each_class do | k |
+ puts k
+ # File 'lib/frank/base.rb', line 27
+def each_class
+ if not block_given?
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " block must be given! "
+ else
+ get_classes . each { | k | yield k }
+ end
+ - (Object ) each_feature
iterator for each feature
e . g .
+self . each_feature do | f |
+ puts f
+ # File 'lib/frank/base.rb', line 45
+def each_feature
+ if not block_given?
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " block must be given! "
+ else
+ get_features . each { | f | yield f }
+ end
+ - (Object ) each_sample
iterator for each sample with class label
e . g .
+self . each_sample do | k , s |
+ print k
+ s . each { | f , v | print ' ' + v . to_s }
+ puts
+ # File 'lib/frank/base.rb', line 65
+def each_sample
+ if not block_given?
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " block must be given! "
+ else
+ get_data . each do | k , samples |
+ samples . each { | s | yield k , s }
+ end
+ end
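The iterators above imply a particular data layout: a hash from class label to an array of samples, where each sample is a hash from feature to value. That layout is inferred from `each_sample` and `get_features` and is not spelled out as a specification here. The sketch below walks such a structure with a hypothetical helper, `each_sample_in`, and made-up sample data.

```ruby
# Hypothetical helper that walks the structure the way each_sample does.
def each_sample_in(data)
  return enum_for(:each_sample_in, data) unless block_given?
  data.each do |label, samples|
    samples.each { |sample| yield label, sample }
  end
end

# Made-up data: class label => array of feature => value hashes.
def sample_data
  {
    :c1 => [{ :f1 => 1, :f2 => 0 }, { :f1 => 1 }], # second sample has no f2
    :c2 => [{ :f2 => 1 }]
  }
end

each_sample_in(sample_data) do |label, sample|
  puts "#{label}: " + sample.map { |f, v| "#{f}=#{v}" }.join(' ')
end
```

A feature that is absent from a sample hash is simply skipped, which matches the `has_key?` checks used elsewhere in these listings.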
+ - (Object ) get_classes
+ # File 'lib/frank/base.rb', line 78
+def get_classes
+ @classes ||= @data . keys
+ - (Object ) get_data
+ # File 'lib/frank/base.rb', line 130
+def get_data
+ @data
+ - (Hash ) get_feature_ranks
get the ranked features based on their best scores
+ # File 'lib/frank/base.rb', line 236
+def get_feature_ranks
+ return @ranks if @ranks
+ scores = get_feature_scores
+ @ranks = { }
+ sorted_features = scores . keys . sort do | x , y |
+ scores [ y ] [ :BEST ] <=> scores [ x ] [ :BEST ]
+ end
+ sorted_features . each_with_index do | sf , si |
+ @ranks [ sf ] = si + 1
+ end
+ @ranks
+ - (Hash ) get_feature_scores
get scores of all features for all classes
+ # File 'lib/frank/base.rb', line 206
+def get_feature_scores
+ return @scores if @scores
+ each_feature do | f |
+ calc_contribution ( f )
+ end
+ @scores . each do | f , ks |
+ @scores [ f ] [ :BEST ] = ks . values . max
+ end
+ @scores
+ - (Object ) get_feature_values (f)
get feature values
+ # File 'lib/frank/base.rb', line 105
+def get_feature_values ( f )
+ @fvs ||= { }
+ if not @fvs . has_key? f
+ @fvs [ f ] = [ ]
+ each_sample do | k , s |
+ @fvs [ f ] << s [ f ] if s . has_key? f
+ end
+ end
+ @fvs [ f ]
+ - (Object ) get_features
get unique features
+ # File 'lib/frank/base.rb', line 95
+def get_features
+ @features ||= @data . map { | x | x [ 1 ] . map { | y | y . keys } } . flatten . uniq
+ - (Object ) get_opt (key)
get non-data information
+ # File 'lib/frank/base.rb', line 149
+def get_opt ( key )
+ @opts . has_key? ( key ) ? @opts [ key ] : nil
+ - (Object ) get_sample_size
number of samples
+ # File 'lib/frank/base.rb', line 161
+def get_sample_size
+ @sz ||= get_data . values . flatten . size
+ - (Object ) print_feature_ranks
print feature ranks
+ # File 'lib/frank/base.rb', line 191
+def print_feature_ranks
+ ranks = get_feature_ranks
+ ranks . each do | f , r |
+ puts " #{ f } => #{ r } "
+ end
+ - (Object ) print_feature_scores (feat = nil, kclass = nil)
print feature scores
+ # File 'lib/frank/base.rb', line 171
+def print_feature_scores ( feat = nil , kclass = nil )
+ scores = get_feature_scores
+ scores . each do | f , ks |
+ next if feat and feat != f
+ print " #{ f } => "
+ ks . each do | k , s |
+ if kclass
+ print " #{ k } -> #{ s } " if k == kclass
+ else
+ print " #{ k } -> #{ s } "
+ end
+ end
+ puts
+ end
+ - (Hash ) select_data_by_rank! (criterion, my_ranks = nil)
data structure will be altered
reconstruct data by rank
+ # File 'lib/frank/base.rb', line 298
+def select_data_by_rank! ( criterion , my_ranks = nil )
+ ranks = my_ranks || get_feature_ranks
+ my_data = { }
+ each_sample do | k , s |
+ my_data [ k ] ||= [ ]
+ my_s = { }
+ s . each do | f , v |
+ my_s [ f ] = v if eval ( " #{ ranks [ f ] } #{ criterion } " )
+ end
+ my_data [ k ] << my_s if not my_s . empty?
+ end
+ set_data ( my_data )
+ - (Hash ) select_data_by_score! (criterion, my_scores = nil)
data structure will be altered
reconstruct data with feature scores satisfying cutoff
+ # File 'lib/frank/base.rb', line 267
+def select_data_by_score! ( criterion , my_scores = nil )
+ scores = my_scores || get_feature_scores
+ my_data = { }
+ each_sample do | k , s |
+ my_data [ k ] ||= [ ]
+ my_s = { }
+ s . each do | f , v |
+ my_s [ f ] = v if eval ( " #{ scores [ f ] [ :BEST ] } #{ criterion } " )
+ end
+ my_data [ k ] << my_s if not my_s . empty?
+ end
+ set_data ( my_data )
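Both selection methods above pass the criterion string (for example `'>0.01'` or `'<=3'`) to `eval`. The sketch below shows one possible alternative that parses the criterion into an operator and a number instead. It is not how the library works. `parse_criterion`, `satisfies?` and `select_features` are hypothetical names, and the scores are given as a flat feature-to-score hash, not the nested structure with a `:BEST` key.

```ruby
# Hypothetical alternative to eval: parse a criterion such as '>0.01' or
# '<=3' into an operator and a number, then compare with public_send.
def parse_criterion(criterion)
  m = criterion.strip.match(/\A(<=|>=|==|<|>)\s*(-?\d+(?:\.\d+)?)\z/)
  raise ArgumentError, "unsupported criterion: #{criterion.inspect}" unless m
  [m[1].to_sym, m[2].to_f]
end

def satisfies?(value, criterion)
  op, threshold = parse_criterion(criterion)
  value.public_send(op, threshold)
end

# keep the features whose score satisfies the criterion
def select_features(scores, criterion)
  scores.select { |_f, s| satisfies?(s, criterion) }.keys
end

p select_features({ :f1 => 0.002, :f2 => 0.4, :f3 => 0.05 }, '>0.01') # → [:f2, :f3]
```

Anything other than a comparison operator followed by a number is rejected with an `ArgumentError`, so arbitrary Ruby code in the criterion is never executed.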
+ - (Object ) set_classes (classes)
+ # File 'lib/frank/base.rb', line 84
+def set_classes ( classes )
+ if classes and classes . class == Array
+ @classes = classes
+ else
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " classes must be a Array object! "
+ end
+ - (Object ) set_data (data)
+ # File 'lib/frank/base.rb', line 135
+def set_data ( data )
+ if data and data . class == Hash
+ @data = data
+ @classes , @features , @fvs = nil , nil , nil
+ @scores , @ranks , @sz = nil , nil , nil
+ else
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " data must be a Hash object! "
+ end
+ - (Object ) set_feature_score (f, k, s)
set feature (f) score (s) for class (k)
+ # File 'lib/frank/base.rb', line 225
+def set_feature_score ( f , k , s )
+ @scores ||= { }
+ @scores [ f ] ||= { }
+ @scores [ f ] [ k ] = s
+ - (Object ) set_features (features)
+ # File 'lib/frank/base.rb', line 119
+def set_features ( features )
+ if features and features . class == Array
+ @features = features
+ else
+ abort " [ #{ __FILE__ } @ #{ __LINE__ } ]: " +
+ " features must be a Array object! "
+ end
+ - (Object ) set_opt (key, value)
set non-data information as a key-value pair
+ # File 'lib/frank/base.rb', line 155
+def set_opt ( key , value )
+ @opts [ key ] = value
\ No newline at end of file
Class: FRank::BaseContinuous
+ Inherits:
+ Base
+ Object
+ Base
+ FRank::BaseContinuous
+ Includes:
+ Discretilizer , Normalizer
+ Defined in:
+ lib/frank/algo_continuous/base_continuous.rb
base ranking algorithm for handling continuous features
+ Instance Method Summary
#discretize_chimerge! , #discretize_equal_frequency! , #discretize_equal_width!
Methods included from Normalizer
#normalize_log! , #normalize_min_max! , #normalize_zscore!
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (BaseContinuous ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/frank/algo_continuous/base_continuous.rb', line 17
+def initialize ( data = nil )
+ super ( data )
\ No newline at end of file
Class: FRank::BaseDiscrete
+ Inherits:
+ Base
+ Object
+ Base
+ FRank::BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/base_discrete.rb
base ranking algorithm for handling discrete features
2 x 2 contingency table
+            c   c'
+          ---------
+       f  | A | B |  A+B
+          |---|---|
+       f' | C | D |  C+D
+          ---------
+           A+C B+D   N = A+B+C+D
+ P(f) = (A+B)/N
+ P(f') = (C+D)/N
+ P(c) = (A+C)/N
+ P(c') = (B+D)/N
+ P(f,c) = A/N
+ P(f,c') = B/N
+ P(f',c) = C/N
+ P(f',c') = D/N
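To make the table concrete, the sketch below counts A, B, C and D for one feature and one class from a small made-up data set, then derives P(f) and P(f,c). The helper `contingency` is hypothetical, and treating "feature present" as "the sample hash has that key" is an assumption of this sketch.

```ruby
# Hypothetical helper: count the 2 x 2 table for feature f and class c,
# treating "feature present" as "the sample hash has key f" (an assumption).
def contingency(data, f, c)
  t = { :A => 0, :B => 0, :C => 0, :D => 0 }
  data.each do |label, samples|
    samples.each do |s|
      cell = if s.key?(f)
               label == c ? :A : :B # feature present
             else
               label == c ? :C : :D # feature absent
             end
      t[cell] += 1
    end
  end
  t
end

# Made-up data: class label => array of feature => value hashes.
def toy_data
  {
    :c1 => [{ :f1 => 1 }, { :f1 => 1 }, { :f2 => 1 }],
    :c2 => [{ :f1 => 1 }, { :f2 => 1 }]
  }
end

t = contingency(toy_data, :f1, :c1)
n = t.values.inject(0) { |s, x| s + x }
puts (t[:A] + t[:B]).to_f / n # P(f)   → 0.6
puts t[:A].to_f / n           # P(f,c) → 0.4
```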
Direct Known Subclasses
Accuracy , AccuracyBalanced , BiNormalSeparation , ChiSquaredTest , CorrelationCoefficient , DocumentFrequency , F1Measure , FishersExactTest , GMean , GSSCoefficient , GiniIndex , InformationGain , InformationGain_d , MatthewsCorrelationCoefficient , McNemarsTest , MutualInformation , OddsRatio , OddsRatioNumerator , Power , Precision , ProbabilityRatio , Random , ReliefF_d , Relief_d , Sensitivity , Specificity
+ Instance Method Summary
+ (collapse )
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (BaseDiscrete ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/frank/algo_discrete/base_discrete.rb', line 29
+def initialize ( data = nil )
+ super ( data )
\ No newline at end of file
Class: FRank::BaseMixed
+ Inherits:
+ Base
+ Object
+ Base
+ FRank::BaseMixed
+ Defined in:
+ lib/frank/algo_mixed/base_mixed.rb
base class for handling feature of mixed data
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (BaseMixed ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/frank/algo_mixed/base_mixed.rb', line 10
+def initialize ( data = nil )
+ super ( data )
\ No newline at end of file
Class: FRank::BiNormalSeparation
+ Inherits:
+ BaseDiscrete
+ Includes:
+ Rubystats
+ Defined in:
+ lib/frank/algo_discrete/BiNormalSeparation.rb
Constant Summary
Constants included
+ from Rubystats
Rubystats::MAX_VALUE , Rubystats::SQRT2 , Rubystats::SQRT2PI , Rubystats::TWO_PI
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::ChiSquaredTest
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/ChiSquaredTest.rb
Chi-Squared test (CHI)
                N * ( P(f,c) * P(f',c') - P(f,c') * P(f',c) )^2
+   CHI(f,c) = -------------------------------------------------
+                        P(f) * P(f') * P(c) * P(c')
+
+                      N * (A*D - B*C)^2
+            = -------------------------------
+               (A+B) * (C+D) * (A+C) * (B+D)
suitable for large samples where
+none of the values of (A, B, C, D) is below 5
ref: Wikipedia
+ and A Comparative Study on Feature Selection Methods for
+ Drug Discovery
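The second form of the formula is easy to evaluate directly from the four cell counts. The sketch below does that in a hypothetical helper, `chi_squared_2x2`. The `:yates` branch applies the textbook continuity correction, which subtracts N/2 from |A*D - B*C| before squaring. Whether the library computes the correction in exactly this way is not shown in this documentation, and clamping the corrected difference at zero is an extra safeguard added for this sketch.

```ruby
# Hypothetical helper: chi-squared statistic for a 2 x 2 contingency table.
# The :yates branch uses the textbook continuity correction; this is an
# assumption about, not a copy of, the library's implementation.
def chi_squared_2x2(a, b, c, d, correction = nil)
  n     = (a + b + c + d).to_f
  denom = (a + b) * (c + d) * (a + c) * (b + d)
  return 0.0 if denom.zero? # an empty row or column: defined as 0.0 here
  diff = (a * d - b * c).abs.to_f
  diff = [diff - n / 2.0, 0.0].max if correction == :yates
  n * diff**2 / denom
end

puts chi_squared_2x2(20, 5, 10, 15)          # about 8.33
puts chi_squared_2x2(20, 5, 10, 15, :yates)  # → 6.75
```

The correction always lowers the statistic, which makes the test more conservative for small cell counts.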
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (ChiSquaredTest ) initialize (correction = nil, data = nil)
+ # File 'lib/frank/algo_discrete/ChiSquaredTest.rb', line 30
+def initialize ( correction = nil , data = nil )
+ super ( data )
+ @correction = ( correction == :yates ) ? true : false
\ No newline at end of file
diff --git a/doc/FRank/CorrelationCoefficient.html b/doc/FRank/CorrelationCoefficient.html
new file mode 100644
index 0000000..c4fbd7b
--- /dev/null
+++ b/doc/FRank/CorrelationCoefficient.html
@@ -0,0 +1,172 @@
+ Class: FRank::CorrelationCoefficient
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/CorrelationCoefficient.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
diff --git a/doc/FRank/DocumentFrequency.html b/doc/FRank/DocumentFrequency.html
new file mode 100644
index 0000000..8b17f5f
--- /dev/null
+++ b/doc/FRank/DocumentFrequency.html
@@ -0,0 +1,169 @@
Class: FRank::DocumentFrequency
+ — Documentation by YARD 0.7.5
+ Class: FRank::DocumentFrequency
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/DocumentFrequency.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
diff --git a/doc/FRank/Ensemble.html b/doc/FRank/Ensemble.html
new file mode 100644
index 0000000..bd893ac
--- /dev/null
+++ b/doc/FRank/Ensemble.html
+ Class: FRank::Ensemble
+ — Documentation by YARD 0.7.5
+ Class: FRank::Ensemble
+ Inherits:
+ Base
+ Object
+ Base
+ FRank::Ensemble
+ Defined in:
+ lib/frank/ensemble.rb
select features by an ensemble of ranking algorithms
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Ensemble) initialize(*algos)
+ # File 'lib/frank/ensemble.rb', line 10
+def initialize(*algos)
+  super(nil)
+  # keep every ranking algorithm passed to the constructor
+  @algos = []
+  algos.each do |r|
+    @algos << r
+  end
+end
Instance Method Details
+ - (Object) by_ave(arr)
by average value of an array
+ # File 'lib/frank/ensemble.rb', line 125
+def by_ave(arr)
+  # NOTE: Array#ave is not core Ruby; presumably defined elsewhere in the package
+  arr.ave if arr.class == Array
+end
+ - (Object) by_max(arr)
by max value of an array
+ # File 'lib/frank/ensemble.rb', line 137
+def by_max(arr)
+  arr.max if arr.class == Array
+end
+ - (Object) by_min(arr)
by min value of an array
+ # File 'lib/frank/ensemble.rb', line 131
+def by_min(arr)
+  arr.min if arr.class == Array
+end
+ - (Object) ensemble_by_rank(by_what = method(:by_min))
ensemble based on rank
+ # File 'lib/frank/ensemble.rb', line 102
+def ensemble_by_rank(by_what = method(:by_min))
+  # combine the ranks that each algorithm assigns to a feature
+  ranks = {}
+  each_feature do |f|
+    ranks[f] = by_what.call(
+      @algos.collect { |r| r.get_feature_ranks[f] }
+    )
+  end
+  # re-rank the features by their combined value (1 = best)
+  new_ranks = {}
+  sorted_features = ranks.keys.sort do |x, y|
+    ranks[x] <=> ranks[y]
+  end
+  sorted_features.each_with_index do |sf, si|
+    new_ranks[sf] = si + 1
+  end
+  @ranks = new_ranks
+end
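The rank-based consensus can be tried outside the package with plain hashes. The following is a minimal sketch, not the package's code: `consensus_ranks` and the sample rank tables are hypothetical names, and ties are broken by original feature order, which is an assumption the method above does not make.

```ruby
# Hypothetical stand-in for ensemble_by_rank, working on plain hashes.
# rank_tables: one { feature => rank } hash per algorithm (1 = best).
# reducer:     callable that collapses the ranks of one feature (min by default).
def consensus_ranks(rank_tables, reducer = ->(arr) { arr.min })
  features = rank_tables.first.keys
  combined = {}
  features.each do |f|
    combined[f] = reducer.call(rank_tables.map { |t| t[f] })
  end
  # Sort by combined value; ties fall back to the original feature order
  # (an assumption made here so that the result is deterministic).
  ordered = features.each_with_index
                    .sort_by { |f, i| [combined[f], i] }
                    .map(&:first)
  new_ranks = {}
  ordered.each_with_index { |f, i| new_ranks[f] = i + 1 }
  new_ranks
end

tables = [
  { f1: 1, f2: 2, f3: 3 },
  { f1: 3, f2: 1, f3: 2 }
]
average = ->(arr) { arr.sum.to_f / arr.size }
p consensus_ranks(tables)           # consensus by minimum rank
p consensus_ranks(tables, average)  # consensus by average rank
```

Passing the reducer as a callable mirrors the `by_what = method(:by_min)` parameter, so switching between min, max, and average needs no change to the ranking logic.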
+ - (Object) ensemble_by_score(by_what = method(:by_max), norm = :min_max)
ensemble based on score
scores from different algorithms are usually not comparable with
+each other, so they are normalized first
+ # File 'lib/frank/ensemble.rb', line 70
+def ensemble_by_score(by_what = method(:by_max), norm = :min_max)
+  # normalize each algorithm's scores so that they share a common scale
+  @algos.each do |r|
+    if norm == :min_max
+      normalize_min_max!(r)
+    elsif norm == :zscore
+      normalize_zscore!(r)
+    else
+      abort "[#{__FILE__}@#{__LINE__}]: " +
+            "invalid normalizer, only :min_max and :zscore supported!"
+    end
+  end
+  # combine the normalized :BEST scores of each feature
+  @scores = {}
+  each_feature do |f|
+    @scores[f] = {}
+    @scores[f][:BEST] = by_what.call(
+      @algos.collect { |r| r.get_feature_scores[f][:BEST] }
+    )
+  end
+end
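To see why normalization comes before combining, here is a small self-contained sketch of the min-max variant. `min_max` and `consensus_scores` are hypothetical helpers written for this illustration; they are not the package's `normalize_min_max!`, and mapping a constant score table to 0.0 is an assumption.

```ruby
# Rescale a { feature => score } hash to the range [0, 1].
def min_max(scores)
  lo, hi = scores.values.minmax
  span = (hi - lo).to_f
  scores.each_with_object({}) do |(f, s), out|
    # assumption: a table whose scores are all equal maps to 0.0
    out[f] = span.zero? ? 0.0 : (s - lo) / span
  end
end

# Normalize every score table, then collapse the scores of each feature
# with the reducer (max by default, as in ensemble_by_score).
def consensus_scores(score_tables, reducer = ->(arr) { arr.max })
  normalized = score_tables.map { |t| min_max(t) }
  normalized.first.keys.each_with_object({}) do |f, out|
    out[f] = reducer.call(normalized.map { |t| t[f] })
  end
end

# Two hypothetical algorithms that score on very different scales.
tables = [
  { a: 0.0,   b: 5.0,   c: 10.0 },
  { a: 100.0, b: 300.0, c: 200.0 }
]
p consensus_scores(tables)
```

Without the rescaling step, the second table would dominate any max-based consensus simply because its raw values are larger.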
+ - (Object) get_feature_ranks
override get_feature_ranks
+ # File 'lib/frank/ensemble.rb', line 47
+def get_feature_ranks
+  return @ranks if @ranks
+  abort "[#{__FILE__}@#{__LINE__}]: " +
+        "please call one consensus ranking method first!"
+end
+ - (Object) get_feature_scores
override get_feature_scores
+ # File 'lib/frank/ensemble.rb', line 36
+def get_feature_scores
+  return @scores if @scores
+  abort "[#{__FILE__}@#{__LINE__}]: " +
+        "please call one consensus scoring method first!"
+end
+ - (Object) set_data(data)
override set_data
all algorithms share the same data structure
+ # File 'lib/frank/ensemble.rb', line 25
+def set_data(data)
+  @data = data
+  # hand the same data structure to every algorithm in the ensemble
+  @algos.each do |r|
+    r.set_data(data)
+  end
+end
\ No newline at end of file
Class: FRank::F1Measure
+ — Documentation by YARD 0.7.5
+ Class: FRank::F1Measure
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/F1Measure.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
diff --git a/doc/FRank/FishersExactTest.html b/doc/FRank/FishersExactTest.html
Class: FRank::FishersExactTest
+ Class: FRank::FishersExactTest
+ Inherits:
+ BaseDiscrete
+ Includes:
+ Rubystats
+ Defined in:
+ lib/frank/algo_discrete/FishersExactTest.rb
(two-sided) Fisher's Exact Test (FET)
          (A+B)! * (C+D)! * (A+C)! * (B+D)!
+    p = -----------------------------------
+             A! * B! * C! * D! * N!
+where N = A+B+C+D
+for FET, the smaller, the better, but we intentionally negate it
+so that the larger is always the better (consistent with other algorithms)
ref: Wikipedia and Rubystats
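As a worked illustration, the probability of a single 2x2 table can be computed with exact integer factorials. This is a sketch under stated assumptions, not the package's implementation (which relies on Rubystats): `factorial`, `table_probability`, and `fet_score` are hypothetical names, and the denominator uses the standard hypergeometric form, which includes N! with N = A+B+C+D. The two-sided test sums this quantity over every table that is at least as extreme; that summation is left out here.

```ruby
# Factorial on exact integers (Ruby integers have arbitrary precision).
def factorial(n)
  (1..n).reduce(1, :*)
end

# Hypergeometric probability of one 2x2 table [[a, b], [c, d]]
# with fixed row and column totals.
def table_probability(a, b, c, d)
  n = a + b + c + d
  numerator = factorial(a + b) * factorial(c + d) *
              factorial(a + c) * factorial(b + d)
  denominator = factorial(a) * factorial(b) * factorial(c) *
                factorial(d) * factorial(n)
  Rational(numerator, denominator).to_f
end

# Negated so that a larger score is better, following the convention
# described in the class documentation.
def fet_score(a, b, c, d)
  -table_probability(a, b, c, d)
end

puts table_probability(1, 9, 11, 3)
puts fet_score(1, 9, 11, 3)
```

Using `Rational` keeps the division exact until the final conversion to a float, which avoids overflow and rounding problems with large factorials.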
Constant Summary
Constants included
+ from Rubystats
Rubystats::MAX_VALUE , Rubystats::SQRT2 , Rubystats::SQRT2PI , Rubystats::TWO_PI
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::GMean
+ — Documentation by YARD 0.7.5
+ Class: FRank::GMean
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/GMean.rb
GMean (GM)
GM = sqrt(Sensitivity * Specificity)
+               TP * TN                     A * D
+   = sqrt(-------------------) = sqrt(---------------)
+           (TP+FN) * (TN+FP)          (A+C) * (B+D)
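A small numeric sketch of the formula, assuming the contingency-table convention implied by the other formulas in these pages (A = TP, B = FP, C = FN, D = TN). `g_mean` is a hypothetical helper, not a method of the package.

```ruby
# Geometric mean of sensitivity and specificity from a 2x2 table.
# a = TP, b = FP, c = FN, d = TN (assumed convention).
def g_mean(a, b, c, d)
  sensitivity = a.to_f / (a + c)  # TP / (TP + FN)
  specificity = d.to_f / (b + d)  # TN / (TN + FP)
  Math.sqrt(sensitivity * specificity)
end

puts g_mean(8, 2, 2, 8)
```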
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::GSSCoefficient
+ — Documentation by YARD 0.7.5
+ Class: FRank::GSSCoefficient
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/GSSCoefficient.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::GiniIndex
+ — Documentation by YARD 0.7.5
+ Class: FRank::GiniIndex
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/GiniIndex.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::InformationGain
+ — Documentation by YARD 0.7.5
+ Class: FRank::InformationGain
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/InformationGain.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::InformationGain_d
+ — Documentation by YARD 0.7.5
+ Class: FRank::InformationGain_d
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/InformationGain_d.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::MatthewsCorrelationCoefficient
+ — Documentation by YARD 0.7.5
+ Class: FRank::MatthewsCorrelationCoefficient
+ Inherits:
+ BaseDiscrete
+ Object
+ Base
+ BaseDiscrete
+ FRank::MatthewsCorrelationCoefficient
+ Defined in:
+ lib/frank/algo_discrete/MatthewsCorrelationCoefficient.rb
Matthews Correlation Coefficient (MCC), also known as Phi coefficient
                       tp*tn - fp*fn
+MCC = ---------------------------------------------- = PHI = sqrt(CHI/N)
+       sqrt((tp+fp) * (tp+fn) * (tn+fp) * (tn+fn))
+                   A*D - B*C
+    = -------------------------------------
+       sqrt((A+B) * (A+C) * (B+D) * (C+D))
ref: Wikipedia
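The A/B/C/D form of the coefficient translates directly into code. This is a hedged sketch: `mcc` is a hypothetical helper, and returning 0.0 when a row or column total is zero is an assumption, since the documentation does not say how the package treats that case.

```ruby
# Matthews correlation coefficient from a 2x2 table
# (a = TP, b = FP, c = FN, d = TN).
def mcc(a, b, c, d)
  denom = Math.sqrt((a + b) * (a + c) * (b + d) * (c + d))
  # assumption: a degenerate table (some margin is zero) scores 0.0
  return 0.0 if denom.zero?
  (a * d - b * c) / denom
end

puts mcc(5, 0, 0, 5)  # perfect agreement
puts mcc(0, 5, 5, 0)  # perfect disagreement
puts mcc(5, 5, 5, 5)  # no association
```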
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::McNemarsTest
+ — Documentation by YARD 0.7.5
+ Class: FRank::McNemarsTest
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/McNemarsTest.rb
McNemar's test (MN), based on Chi-Squared test
+            (B-C)^2
+MN(f, c) = ---------
+              B+C
suitable for large samples and B+C >= 25
ref: Wikipedia
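The statistic depends only on the two discordant cells B and C. The sketch below uses the standard uncorrected form (B-C)^2 / (B+C); `mcnemar` is a hypothetical helper, the continuity correction is deliberately left out, and returning 0.0 for B+C = 0 is an assumption.

```ruby
# McNemar's statistic from the discordant counts b and c.
def mcnemar(b, c)
  # assumption: with no discordant pairs the statistic is taken as 0.0
  return 0.0 if (b + c).zero?
  ((b - c)**2).to_f / (b + c)
end

puts mcnemar(30, 10)
```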
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (McNemarsTest) initialize(correction = nil, data = nil)
+ # File 'lib/frank/algo_discrete/McNemarsTest.rb', line 22
+def initialize(correction = nil, data = nil)
+  super(data)
+  # apply Yates' continuity correction only when :yates is passed
+  @correction = (correction == :yates)
+end
\ No newline at end of file
Class: FRank::MutualInformation
+ — Documentation by YARD 0.7.5
+ Class: FRank::MutualInformation
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/MutualInformation.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::OddsRatio
+ — Documentation by YARD 0.7.5
+ Class: FRank::OddsRatio
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/OddsRatio.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::OddsRatioNumerator
+ — Documentation by YARD 0.7.5
+ Class: FRank::OddsRatioNumerator
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/OddsRatioNumerator.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::PMetric
+ — Documentation by YARD 0.7.5
+ Class: FRank::PMetric
+ Inherits:
+ BaseContinuous
+ Defined in:
+ lib/frank/algo_continuous/PMetric.rb
Method Summary
#discretize_chimerge! , #discretize_equal_frequency! , #discretize_equal_width!
Methods included from Normalizer
#normalize_log! , #normalize_min_max! , #normalize_zscore!
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::Power
+ — Documentation by YARD 0.7.5
+ Class: FRank::Power
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/Power.rb
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Power) initialize(k = 5, data = nil)
+ # File 'lib/frank/algo_discrete/Power.rb', line 24
+def initialize(k = 5, data = nil)
+  super(data)
+  @k = k
+end
\ No newline at end of file
Class: FRank::Precision
+ — Documentation by YARD 0.7.5
+ Class: FRank::Precision
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/Precision.rb
+               TP        A
+Precision = ------- = -----
+             TP+FP     A+B
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::ProbabilityRatio
+ — Documentation by YARD 0.7.5
+ Class: FRank::ProbabilityRatio
+ Inherits:
+ BaseDiscrete
+ show all
+ Defined in:
+ lib/frank/algo_discrete/ProbabilityRatio.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::Random
+ — Documentation by YARD 0.7.5
+ Class: FRank::Random
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/Random.rb
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Random) initialize(seed = nil, data = nil)
initialize from an existing data structure
+ # File 'lib/frank/algo_discrete/Random.rb', line 22
+def initialize(seed = nil, data = nil)
+  super(data)
+  # seed the global random number generator only when a seed is given
+  srand(seed) if seed
+end
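Because the constructor forwards the seed to `Kernel#srand`, passing the same seed should make any later `Kernel#rand` calls repeat the same sequence. A minimal illustration using only core Ruby (how the class itself draws its random scores is not shown in this page, so that part is left out):

```ruby
# Seeding the global generator twice with the same value
# reproduces the same sequence of random numbers.
srand(42)
first_run = Array.new(3) { rand }

srand(42)
second_run = Array.new(3) { rand }

puts first_run == second_run
```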
\ No newline at end of file
Class: FRank::ReliefF_c
+ — Documentation by YARD 0.7.5
+ Class: FRank::ReliefF_c
+ Inherits:
+ BaseContinuous
+ Defined in:
+ lib/frank/algo_continuous/ReliefF_c.rb
+ Instance Method Summary
#discretize_chimerge! , #discretize_equal_frequency! , #discretize_equal_width!
Methods included from Normalizer
#normalize_log! , #normalize_min_max! , #normalize_zscore!
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (ReliefF_c) initialize(m = nil, k = 10, data = nil)
+ # File 'lib/frank/algo_continuous/ReliefF_c.rb', line 23
+def initialize(m = nil, k = 10, data = nil)
+  super(data)
+  @m = m
+  # fall back to 10 when k is explicitly passed as nil
+  @k = (k || 10)
+end
\ No newline at end of file
Class: FRank::ReliefF_d
+ — Documentation by YARD 0.7.5
+ Class: FRank::ReliefF_d
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/ReliefF_d.rb
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (ReliefF_d) initialize(m = nil, k = 10, data = nil)
+ # File 'lib/frank/algo_discrete/ReliefF_d.rb', line 22
+def initialize(m = nil, k = 10, data = nil)
+  super(data)
+  @m = m
+  # fall back to 10 when k is explicitly passed as nil
+  @k = (k || 10)
+end
\ No newline at end of file
Class: FRank::Relief_c
+ — Documentation by YARD 0.7.5
+ Class: FRank::Relief_c
+ Inherits:
+ BaseContinuous
+ Defined in:
+ lib/frank/algo_continuous/Relief_c.rb
+ Instance Method Summary
#discretize_chimerge! , #discretize_equal_frequency! , #discretize_equal_width!
Methods included from Normalizer
#normalize_log! , #normalize_min_max! , #normalize_zscore!
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Relief_c) initialize(m = nil, data = nil)
+ # File 'lib/frank/algo_continuous/Relief_c.rb', line 23
+def initialize(m = nil, data = nil)
+  super(data)
+  @m = m
+end
\ No newline at end of file
Class: FRank::Relief_d
+ — Documentation by YARD 0.7.5
+ Class: FRank::Relief_d
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/Relief_d.rb
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Relief_d) initialize(m = nil, data = nil)
+ # File 'lib/frank/algo_discrete/Relief_d.rb', line 23
+def initialize(m = nil, data = nil)
+  super(data)
+  @m = m
+end
\ No newline at end of file
Class: FRank::Sensitivity
+ — Documentation by YARD 0.7.5
+ Class: FRank::Sensitivity
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/frank/algo_discrete/Sensitivity.rb
Sensitivity (SN), also known as Recall
+        TP        A
+SN = ------- = -----
+      TP+FN     A+C
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::Specificity
+ — Documentation by YARD 0.7.5
+ Class: FRank::Specificity
+ Inherits:
+ BaseDiscrete
+ show all
+ Defined in:
+ lib/frank/algo_discrete/Specificity.rb
Specificity (SP)
+        TN        D
+SP = ------- = -----
+      TN+FP     B+D
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FRank::TScore
+ — Documentation by YARD 0.7.5
+ Class: FRank::TScore
+ Inherits:
+ BaseContinuous
+ Defined in:
+ lib/frank/algo_continuous/TScore.rb
TS applicable only to two-class problems
t-score (TS) based on Student's t-test for continuous features
                          |u1 - u2|
+TS(f) = --------------------------------------------
+         sqrt((n1*sigma1^2 + n2*sigma2^2)/(n1+n2))
ref: Filter versus wrapper gene selection approaches
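The formula can be checked by hand on two small samples. The sketch below is illustrative only: `mean`, `variance`, and `t_score` are hypothetical helpers, and sigma^2 is taken to be the population variance (divide by n), which is an assumption because the documentation does not say which variance estimator the package uses.

```ruby
def mean(xs)
  xs.sum.to_f / xs.size
end

# Population variance (divide by n); this choice is an assumption.
def variance(xs)
  m = mean(xs)
  xs.sum { |x| (x - m)**2 } / xs.size
end

# t-score of one feature given its values in class 1 and class 2.
def t_score(xs1, xs2)
  n1 = xs1.size
  n2 = xs2.size
  pooled = (n1 * variance(xs1) + n2 * variance(xs2)) / (n1 + n2)
  (mean(xs1) - mean(xs2)).abs / Math.sqrt(pooled)
end

# Means 2 and 6, both variances 1, so the score is |2 - 6| / 1.
puts t_score([1, 3], [5, 7])
```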
Method Summary
#discretize_chimerge! , #discretize_equal_frequency! , #discretize_equal_width!
Methods included from Normalizer
#normalize_log! , #normalize_min_max! , #normalize_zscore!
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Module: FSelector
+ — Documentation by YARD 0.7.5
+ Module: FSelector
+ Defined in:
+ lib/fselector.rb,
+ lib/fselector/base.rb, lib/fselector/ensemble.rb, lib/fselector/base_discrete.rb, lib/fselector/base_continuous.rb, lib/fselector/algo_discrete/GMean.rb, lib/fselector/algo_discrete/Power.rb, lib/fselector/algo_discrete/Random.rb, lib/fselector/algo_discrete/Accuracy.rb, lib/fselector/algo_continuous/TScore.rb, lib/fselector/algo_discrete/Relief_d.rb, lib/fselector/algo_continuous/PMetric.rb, lib/fselector/algo_discrete/GiniIndex.rb, lib/fselector/algo_discrete/ReliefF_d.rb, lib/fselector/algo_discrete/OddsRatio.rb, lib/fselector/algo_discrete/Precision.rb, lib/fselector/algo_discrete/F1Measure.rb, lib/fselector/algo_continuous/Relief_c.rb, lib/fselector/algo_discrete/Specificity.rb, lib/fselector/algo_discrete/Sensitivity.rb, lib/fselector/algo_continuous/ReliefF_c.rb, lib/fselector/algo_discrete/McNemarsTest.rb, lib/fselector/algo_discrete/ChiSquaredTest.rb, lib/fselector/algo_discrete/GSSCoefficient.rb, lib/fselector/algo_discrete/InformationGain.rb, lib/fselector/algo_discrete/FishersExactTest.rb, lib/fselector/algo_discrete/ProbabilityRatio.rb, lib/fselector/algo_discrete/AccuracyBalanced.rb, lib/fselector/algo_discrete/DocumentFrequency.rb, lib/fselector/algo_discrete/MutualInformation.rb, lib/fselector/algo_discrete/OddsRatioNumerator.rb, lib/fselector/algo_discrete/BiNormalSeparation.rb, lib/fselector/algo_discrete/CorrelationCoefficient.rb, lib/fselector/algo_discrete/MatthewsCorrelationCoefficient.rb
FSelector: a Ruby gem for feature selection and ranking
Defined Under Namespace
+ Classes: Accuracy , AccuracyBalanced , Base , BaseContinuous , BaseDiscrete , BiNormalSeparation , ChiSquaredTest , CorrelationCoefficient , DocumentFrequency , Ensemble , F1Measure , FishersExactTest , GMean , GSSCoefficient , GiniIndex , InformationGain , MatthewsCorrelationCoefficient , McNemarsTest , MutualInformation , OddsRatio , OddsRatioNumerator , PMetric , Power , Precision , ProbabilityRatio , Random , ReliefF_c , ReliefF_d , Relief_c , Relief_d , Sensitivity , Specificity , TScore
Constant Summary
+ ' 0.1.0 '
+ GM = GMean
shortcut so that you can use FSelector::GM instead of FSelector::GMean
+ Acc = Accuracy
shortcut so that you can use FSelector::Acc instead of FSelector::Accuracy
+ TS = TScore
shortcut so that you can use FSelector::TS instead of FSelector::TScore
+ PM = PMetric
shortcut so that you can use FSelector::PM instead of FSelector::PMetric
+ GI = GiniIndex
shortcut so that you can use FSelector::GI instead of FSelector::GiniIndex
+ Odd = OddsRatio
shortcut so that you can use FSelector::Odd instead of FSelector::OddsRatio
+ F1 = F1Measure
shortcut so that you can use FSelector::F1 instead of FSelector::F1Measure
+ SP = Specificity
shortcut so that you can use FSelector::SP instead of FSelector::Specificity
+ SN = Sensitivity
shortcut so that you can use FSelector::SN instead of FSelector::Sensitivity
+ Recall = Sensitivity
Sensitivity, also known as Recall
+ MNT = McNemarsTest
shortcut so that you can use FSelector::MNT instead of FSelector::McNemarsTest
+ CHI = ChiSquaredTest
shortcut so that you can use FSelector::CHI instead of FSelector::ChiSquaredTest
+ GSS = GSSCoefficient
shortcut so that you can use FSelector::GSS instead of FSelector::GSSCoefficient
+ IG = InformationGain
shortcut so that you can use FSelector::IG instead of FSelector::InformationGain
+ FET = FishersExactTest
shortcut so that you can use FSelector::FET instead of FSelector::FishersExactTest
+ PR = ProbabilityRatio
shortcut so that you can use FSelector::PR instead of FSelector::ProbabilityRatio
+ Acc2 = AccuracyBalanced
shortcut so that you can use FSelector::Acc2 instead of FSelector::AccuracyBalanced
+ DF = DocumentFrequency
shortcut so that you can use FSelector::DF instead of FSelector::DocumentFrequency
+ MI = MutualInformation
shortcut so that you can use FSelector::MI instead of FSelector::MutualInformation
+ OddN = OddsRatioNumerator
shortcut so that you can use FSelector::OddN instead of FSelector::OddsRatioNumerator
+ BNS = BiNormalSeparation
shortcut so that you can use FSelector::BNS instead of FSelector::BiNormalSeparation
+ CC = CorrelationCoefficient
shortcut so that you can use FSelector::CC instead of FSelector::CorrelationCoefficient
+ MCC = MatthewsCorrelationCoefficient
shortcut so that you can use FSelector::MCC instead of FSelector::MatthewsCorrelationCoefficient
+ PHI = MatthewsCorrelationCoefficient
Matthews Correlation Coefficient (MCC), also known as Phi coefficient
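These shortcuts are ordinary Ruby constant assignments, so a short name and a long name refer to the very same class object. A minimal sketch with a hypothetical `Demo` module standing in for `FSelector`:

```ruby
# Demo is a hypothetical module used only for this illustration.
module Demo
  class GMean; end

  # Alias constant: no new class is created, GM simply points at GMean.
  GM = GMean
end

puts Demo::GM.equal?(Demo::GMean)  # both constants hold the same object
puts Demo::GM.name                 # the class keeps its original name
```

One consequence is that the alias does not show up in `inspect` output or error messages, which keep using the long class name.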
\ No newline at end of file
Class: FSelector::Accuracy
+ — Documentation by YARD 0.7.5
+ Class: FSelector::Accuracy
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/fselector/algo_discrete/Accuracy.rb
Accuracy (Acc)
           tp+tn          A+D
+Acc = ------------- = ---------
+       tp+fn+tn+fp     A+B+C+D
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FSelector::AccuracyBalanced
+ — Documentation by YARD 0.7.5
+ Class: FSelector::AccuracyBalanced
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/fselector/algo_discrete/AccuracyBalanced.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FSelector::Base
+ — Documentation by YARD 0.7.5
+ Class: FSelector::Base
+ Inherits:
+ Object
+ Object
+ FSelector::Base
+ Includes:
+ FileIO
+ Defined in:
+ lib/fselector/base.rb
base ranking algorithm
+ Instance Method Summary
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (Base ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/fselector/base.rb', line 13
+def initialize(data = nil)
+  @data = data
+  @opts = {}
+end
Instance Method Details
+ - (Object ) each_class
iterator for each class
e.g.
+self.each_class do |k|
+  puts k
+end
+ # File 'lib/fselector/base.rb', line 27
+def each_class
+  if not block_given?
+    abort "[#{__FILE__}@#{__LINE__}]: " +
+          "block must be given!"
+  else
+    get_classes.each { |k| yield k }
+  end
+end
+ - (Object ) each_feature
iterator for each feature
e.g.
+self.each_feature do |f|
+  puts f
+end
+ # File 'lib/fselector/base.rb', line 45
+def each_feature
+  if not block_given?
+    abort "[#{__FILE__}@#{__LINE__}]: " +
+          "block must be given!"
+  else
+    get_features.each { |f| yield f }
+  end
+end
+ - (Object ) each_sample
iterator for each sample with class label
e.g.
+self.each_sample do |k, s|
+  print k
+  s.each { |f, v| print " #{v}" }
+  puts
+end
+ # File 'lib/fselector/base.rb', line 65
+def each_sample
+  if not block_given?
+    abort "[#{__FILE__}@#{__LINE__}]: " +
+          "block must be given!"
+  else
+    get_data.each do |k, samples|
+      samples.each { |s| yield k, s }
+    end
+  end
+end
+ - (Object ) get_classes
+ # File 'lib/fselector/base.rb', line 78
+def get_classes
+  @classes ||= @data.keys
+end
+ - (Object ) get_data
+ # File 'lib/fselector/base.rb', line 130
+def get_data
+  @data
+end
+ - (Hash ) get_feature_ranks
get the ranked features based on their best scores
+ # File 'lib/fselector/base.rb', line 236
+def get_feature_ranks
+  return @ranks if @ranks
+  scores = get_feature_scores
+  @ranks = {}
+  sorted_features = scores.keys.sort do |x, y|
+    scores[y][:BEST] <=> scores[x][:BEST]
+  end
+  sorted_features.each_with_index do |sf, si|
+    @ranks[sf] = si + 1
+  end
+  @ranks
+end
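The ranking step above can be sketched outside the class as follows. This is only an illustration under stated assumptions: the `scores` hash is made-up sample data shaped like the result of `get_feature_scores`, and `rank_features` is a hypothetical stand-alone helper, not an FSelector method.

```ruby
# Illustrative scores hash (made-up values), shaped as
# feature => { class => score, :BEST => best score across classes }
scores = {
  :f1 => { :c1 => 0.2, :c2 => 0.9, :BEST => 0.9 },
  :f2 => { :c1 => 0.5, :c2 => 0.4, :BEST => 0.5 },
  :f3 => { :c1 => 0.7, :c2 => 0.1, :BEST => 0.7 }
}

# Hypothetical helper: sort features by descending :BEST score,
# then assign 1-based ranks in that order.
def rank_features(scores)
  ranks = {}
  sorted = scores.keys.sort { |x, y| scores[y][:BEST] <=> scores[x][:BEST] }
  sorted.each_with_index { |f, i| ranks[f] = i + 1 }
  ranks
end

ranks = rank_features(scores)
p ranks
```

Because the comparison is `scores[y] <=> scores[x]`, a larger `:BEST` score yields a smaller (better) rank.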
+ - (Hash ) get_feature_scores
get scores of all features for all classes
+ # File 'lib/fselector/base.rb', line 206
+def get_feature_scores
+  return @scores if @scores
+  each_feature do |f|
+    calc_contribution(f)
+  end
+  @scores.each do |f, ks|
+    @scores[f][:BEST] = ks.values.max
+  end
+  @scores
+end
+ - (Object ) get_feature_values (f)
get feature values
+ # File 'lib/fselector/base.rb', line 105
+def get_feature_values(f)
+  @fvs ||= {}
+  if not @fvs.has_key? f
+    @fvs[f] = []
+    each_sample do |k, s|
+      @fvs[f] << s[f] if s.has_key? f
+    end
+  end
+  @fvs[f]
+end
+ - (Object ) get_features
get unique features
+ # File 'lib/fselector/base.rb', line 95
+def get_features
+  @features ||= @data.map { |x| x[1].map { |y| y.keys } }.flatten.uniq
+end
+ - (Object ) get_opt (key)
get non-data information
+ # File 'lib/fselector/base.rb', line 149
+def get_opt(key)
+  @opts.has_key?(key) ? @opts[key] : nil
+end
+ - (Object ) get_sample_size
number of samples
+ # File 'lib/fselector/base.rb', line 161
+def get_sample_size
+  @sz ||= get_data.values.flatten.size
+end
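The accessors above imply a particular data layout: a hash that maps each class label to an array of samples, where every sample is itself a feature => value hash. The sketch below applies the same expressions to a small made-up data set; the class labels, feature names and values are assumptions chosen only for illustration.

```ruby
# Assumed data layout, inferred from the accessors above:
# class label => array of samples, each sample a feature => value hash.
data = {
  :spam => [{ :f1 => 1, :f2 => 1 }, { :f1 => 1 }],
  :ham  => [{ :f2 => 1, :f3 => 1 }]
}

classes  = data.keys                                          # as in get_classes
features = data.map { |x| x[1].map { |y| y.keys } }.flatten.uniq  # as in get_features
size     = data.values.flatten.size                           # as in get_sample_size

p classes
p features
p size
```

Note that `flatten` only flattens nested arrays, so each sample hash counts as one element when the sample size is computed.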
+ - (Object ) print_feature_ranks
print feature ranks
+ # File 'lib/fselector/base.rb', line 191
+def print_feature_ranks
+  ranks = get_feature_ranks
+  ranks.each do |f, r|
+    puts "#{f} => #{r}"
+  end
+end
+ - (Object ) print_feature_scores (feat = nil, kclass = nil)
print feature scores
+ # File 'lib/fselector/base.rb', line 171
+def print_feature_scores(feat = nil, kclass = nil)
+  scores = get_feature_scores
+  scores.each do |f, ks|
+    next if feat and feat != f
+    print "#{f} => "
+    ks.each do |k, s|
+      if kclass
+        print "#{k} -> #{s} " if k == kclass
+      else
+        print "#{k} -> #{s} "
+      end
+    end
+    puts
+  end
+end
+ - (Hash ) select_data_by_rank! (criterion, my_ranks = nil)
data structure will be altered
reconstruct data by rank
+ # File 'lib/fselector/base.rb', line 298
+def select_data_by_rank!(criterion, my_ranks = nil)
+  ranks = my_ranks || get_feature_ranks
+  my_data = {}
+  each_sample do |k, s|
+    my_data[k] ||= []
+    my_s = {}
+    s.each do |f, v|
+      my_s[f] = v if eval("#{ranks[f]} #{criterion}")
+    end
+    my_data[k] << my_s if not my_s.empty?
+  end
+  set_data(my_data)
+end
+ - (Hash ) select_data_by_score! (criterion, my_scores = nil)
data structure will be altered
reconstruct data with feature scores satisfying cutoff
+ # File 'lib/fselector/base.rb', line 267
+def select_data_by_score!(criterion, my_scores = nil)
+  scores = my_scores || get_feature_scores
+  my_data = {}
+  each_sample do |k, s|
+    my_data[k] ||= []
+    my_s = {}
+    s.each do |f, v|
+      my_s[f] = v if eval("#{scores[f][:BEST]} #{criterion}")
+    end
+    my_data[k] << my_s if not my_s.empty?
+  end
+  set_data(my_data)
+end
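Both selection methods take the criterion as a string fragment (an operator followed by a cutoff) that is appended to the rank or score and passed to `eval`. The sketch below mirrors that logic in a hypothetical stand-alone helper, `select_by_score`, which returns a new hash instead of altering the receiver; the data and scores are made-up values.

```ruby
# Hypothetical stand-alone helper mirroring the selection logic above;
# it returns a new hash rather than modifying data in place.
def select_by_score(data, scores, criterion)
  selected = {}
  data.each do |k, samples|
    selected[k] ||= []
    samples.each do |s|
      # keep a feature only if "<best score> <criterion>" evaluates to true
      kept = s.select { |f, _v| eval("#{scores[f][:BEST]} #{criterion}") }
      selected[k] << kept unless kept.empty?
    end
  end
  selected
end

# Illustrative data: class label => array of feature => value hashes.
data = {
  :spam => [{ :f1 => 1, :f2 => 1 }, { :f1 => 1 }],
  :ham  => [{ :f2 => 1, :f3 => 1 }]
}
scores = {
  :f1 => { :BEST => 0.9 },
  :f2 => { :BEST => 0.5 },
  :f3 => { :BEST => 0.7 }
}

subset = select_by_score(data, scores, '>= 0.7')
p subset
```

Because the criterion is evaluated with `eval`, it should only ever come from trusted input such as a literal in your own script.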
+ - (Object ) set_classes (classes)
+ # File 'lib/fselector/base.rb', line 84
+def set_classes(classes)
+  if classes and classes.class == Array
+    @classes = classes
+  else
+    abort "[#{__FILE__}@#{__LINE__}]: " +
+          "classes must be a Array object!"
+  end
+end
+ - (Object ) set_data (data)
+ # File 'lib/fselector/base.rb', line 135
+def set_data(data)
+  if data and data.class == Hash
+    @data = data
+    @classes, @features, @fvs = nil, nil, nil
+    @scores, @ranks, @sz = nil, nil, nil
+  else
+    abort "[#{__FILE__}@#{__LINE__}]: " +
+          "data must be a Hash object!"
+  end
+end
+ - (Object ) set_feature_score (f, k, s)
set feature (f) score (s) for class (k)
+ # File 'lib/fselector/base.rb', line 225
+def set_feature_score(f, k, s)
+  @scores ||= {}
+  @scores[f] ||= {}
+  @scores[f][k] = s
+end
+ - (Object ) set_features (features)
+ # File 'lib/fselector/base.rb', line 119
+def set_features(features)
+  if features and features.class == Array
+    @features = features
+  else
+    abort "[#{__FILE__}@#{__LINE__}]: " +
+          "features must be a Array object!"
+  end
+end
+ - (Object ) set_opt (key, value)
set non-data information as a key-value pair
+ # File 'lib/fselector/base.rb', line 155
+def set_opt(key, value)
+  @opts[key] = value
+end
\ No newline at end of file
Class: FSelector::BaseContinuous
+ — Documentation by YARD 0.7.5
+ Class: FSelector::BaseContinuous
+ Inherits:
+ Base
+ Object
+ Base
+ FSelector::BaseContinuous
+ Includes:
+ Discretilizer , Normalizer
+ Defined in:
+ lib/fselector/base_continuous.rb
base ranking algorithm for handling continuous features
+ Instance Method Summary
Methods included from Discretilizer
#discretize_chimerge! , #discretize_equal_frequency! , #discretize_equal_width!
Methods included from Normalizer
#normalize_log! , #normalize_min_max! , #normalize_zscore!
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (BaseContinuous ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/fselector/base_continuous.rb', line 17
+def initialize(data = nil)
+  super(data)
+end
\ No newline at end of file
Class: FSelector::BaseDiscrete
+ — Documentation by YARD 0.7.5
+ Class: FSelector::BaseDiscrete
+ Inherits:
+ Base
+ Object
+ Base
+ FSelector::BaseDiscrete
+ Defined in:
+ lib/fselector/base_discrete.rb
base ranking algorithm for handling discrete features
2 x 2 contingency table
+        c   c'
+      ---------
+   f  | A | B | A+B
+      |---|---|
+   f' | C | D | C+D
+      ---------
+       A+C B+D   N = A+B+C+D
+ P(f) = (A+B)/N
+ P(f') = (C+D)/N
+ P(c) = (A+C)/N
+ P(c') = (B+D)/N
+ P(f,c) = A/N
+ P(f,c') = B/N
+ P(f',c) = C/N
+ P(f',c') = D/N
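To make the table concrete, the sketch below counts A, B, C and D for one feature and one class. It is an illustration under assumptions: `contingency_counts` is a hypothetical helper, not an FSelector method, the data set is made up, and a feature is treated as present (f) when the sample hash has that key and absent (f') otherwise.

```ruby
# Hypothetical helper: count the cells of the 2 x 2 contingency table
# for one feature and one class.
# Assumption: a feature is "present" when the sample hash has its key.
def contingency_counts(data, feature, klass)
  a = b = c = d = 0
  data.each do |k, samples|
    samples.each do |s|
      present  = s.has_key?(feature)
      in_class = (k == klass)
      if present && in_class
        a += 1          # f  and c
      elsif present
        b += 1          # f  and c'
      elsif in_class
        c += 1          # f' and c
      else
        d += 1          # f' and c'
      end
    end
  end
  [a, b, c, d]
end

# Illustrative data: class label => array of feature => value hashes.
data = {
  :spam => [{ :f1 => 1, :f2 => 1 }, { :f1 => 1 }],
  :ham  => [{ :f2 => 1, :f3 => 1 }]
}

a, b, c, d = contingency_counts(data, :f2, :spam)
n = a + b + c + d
puts "A=#{a} B=#{b} C=#{c} D=#{d} N=#{n}"
puts "P(f) = #{(a + b).to_f / n}"
```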
Direct Known Subclasses
Accuracy , AccuracyBalanced , BiNormalSeparation , ChiSquaredTest , CorrelationCoefficient , DocumentFrequency , F1Measure , FishersExactTest , GMean , GSSCoefficient , GiniIndex , InformationGain , MatthewsCorrelationCoefficient , McNemarsTest , MutualInformation , OddsRatio , OddsRatioNumerator , Power , Precision , ProbabilityRatio , Random , ReliefF_d , Relief_d , Sensitivity , Specificity
+ Instance Method Summary
+ (collapse )
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (BaseDiscrete ) initialize (data = nil)
initialize from an existing data structure
+ # File 'lib/fselector/base_discrete.rb', line 29
+def initialize(data = nil)
+  super(data)
+end
\ No newline at end of file
Class: FSelector::BiNormalSeparation
+ — Documentation by YARD 0.7.5
+ Class: FSelector::BiNormalSeparation
+ Inherits:
+ BaseDiscrete
+ Includes:
+ Rubystats
+ Defined in:
+ lib/fselector/algo_discrete/BiNormalSeparation.rb
Constant Summary
Constants included
+ from Rubystats
Rubystats::MAX_VALUE , Rubystats::SQRT2 , Rubystats::SQRT2PI , Rubystats::TWO_PI
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FSelector::ChiSquaredTest
+ — Documentation by YARD 0.7.5
+ Class: FSelector::ChiSquaredTest
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/fselector/algo_discrete/ChiSquaredTest.rb
Chi-Squared test (CHI)
CHI(f,c) = N * ( P(f,c) * P(f',c') - P(f,c') * P(f',c) )^2 / ( P(f) * P(f') * P(c) * P(c') )
+         = N * (A*D - B*C)^2 / ( (A+B) * (C+D) * (A+C) * (B+D) )
suitable for large samples in which
+none of the values (A, B, C, D) is less than 5
ref: Wikipedia and "A Comparative Study on Feature Selection Methods for Drug Discovery"
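The count-based form of the statistic can be sketched as below. `chi_squared` is a hypothetical helper, not the library's implementation; it omits the optional Yates correction accepted by the constructor, and returning 0.0 when a marginal total is zero is an assumption made here to avoid division by zero.

```ruby
# Hypothetical helper: chi-squared statistic from the cells of a
# 2 x 2 contingency table, without Yates' continuity correction.
def chi_squared(a, b, c, d)
  n = a + b + c + d
  denom = (a + b) * (c + d) * (a + c) * (b + d)
  return 0.0 if denom.zero?   # assumption: undefined case mapped to 0
  n * ((a * d - b * c)**2).to_f / denom
end

# All four cells are >= 5, matching the stated applicability condition.
puts chi_squared(30, 10, 10, 50)
```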
+ Instance Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
Constructor Details
+ - (ChiSquaredTest ) initialize (correction = nil, data = nil)
+ # File 'lib/fselector/algo_discrete/ChiSquaredTest.rb', line 30
+def initialize(correction = nil, data = nil)
+  super(data)
+  @correction = (correction == :yates) ? true : false
+end
\ No newline at end of file
Class: FSelector::CorrelationCoefficient
+ — Documentation by YARD 0.7.5
+ Class: FSelector::CorrelationCoefficient
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/fselector/algo_discrete/CorrelationCoefficient.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FSelector::DocumentFrequency
+ — Documentation by YARD 0.7.5
+ Class: FSelector::DocumentFrequency
+ Inherits:
+ BaseDiscrete
+ Defined in:
+ lib/fselector/algo_discrete/DocumentFrequency.rb
Method Summary
Methods inherited from Base
#each_class , #each_feature , #each_sample , #get_classes , #get_data , #get_feature_ranks , #get_feature_scores , #get_feature_values , #get_features , #get_opt , #get_sample_size , #initialize , #print_feature_ranks , #print_feature_scores , #select_data_by_rank! , #select_data_by_score! , #set_classes , #set_data , #set_feature_score , #set_features , #set_opt
Methods included from FileIO
#data_from_csv , #data_from_libsvm , #data_from_random , #data_from_weka , #data_to_csv , #data_to_libsvm , #data_to_weka
\ No newline at end of file
Class: FSelector::Ensemble
+ — Documentation by YARD 0.7.5
+ Class: FSelector::Ensemble
+ Inherits:
+ Base
+ Object
+ Base
+ - (Object ) by_min (arr)
by min value of an array
+ # File 'lib/fselector/ensemble.rb', line 132
+def by_min(arr)
+  arr.min if arr.class == Array
+end
\ No newline at end of file
+ - (Object ) data_to_weka (fname = :stdout, format = nil)
write to WEKA ARFF file
+ # File 'lib/fselector/fileio.rb', line 361
+def data_to_weka(fname = :stdout, format = nil)
+  if fname == :stdout
+    ofs = $stdout
+  else
+    ofs = File.open(fname, 'w')
+  end
+  comments = get_opt('COMMENTS')
+  if comments
+    ofs.puts comments.join("\n")
+    ofs.puts
+  end
+  relation = get_opt('@RELATION')
+  if relation
+    ofs.puts "@RELATION #{relation}"
+  else
+    ofs.puts "@RELATION data_gen_by_FSelector"
+  end
+  ofs.puts
+  each_feature do |f|
+    ofs.print "@ATTRIBUTE #{f} "
+    type = get_opt(f)
+    if type
+      if type == 'NOMINAL'
+        ofs.puts "{#{get_feature_values(f).uniq.sort.join(',')}}"
+      else
+        ofs.puts type
+      end
+    else
+      ofs.puts "STRING"
+    end
+  end
+  ofs.puts "@ATTRIBUTE class {#{get_classes.join(',')}}"
+  ofs.puts
+  ofs.puts "@DATA"
+  each_sample do |k, s|
+    if format == :sparse
+      ofs.print "{"
+      get_features.each_with_index do |f, i|
+        if s.has_key? f
+          ofs.print "#{i} #{s[f]}," if not s[f].zero?
+        else
+          ofs.print "#{i} ?,"
+        end
+      end
+      ofs.print "#{get_features.size} #{k}"
+      ofs.puts "}"
+    else
+      each_feature do |f|
+        if s.has_key? f
+          ofs.print "#{s[f]},"
+        else
+          ofs.print "?,"
+        end
+      end
+      ofs.puts "#{k}"
+    end
+  end
+  ofs.close if not ofs == $stdout
+end
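The difference between the dense and sparse `@DATA` rows written above can be sketched with a small helper. `arff_row` is hypothetical and not part of FSelector, and the separators it uses are an assumption, so the exact whitespace of the library's output may differ. As in the method above, a missing feature is written as `?`, zero values are omitted from sparse rows, and the class label takes the index after the last feature.

```ruby
# Hypothetical helper: format one sample as a dense or sparse ARFF data row.
# Separators are an assumption; FSelector's exact whitespace may differ.
def arff_row(features, sample, klass, sparse = false)
  if sparse
    cells = []
    features.each_with_index do |f, i|
      if sample.has_key?(f)
        # sparse rows list only non-zero values as "index value"
        cells << "#{i} #{sample[f]}" unless sample[f].zero?
      else
        cells << "#{i} ?"   # missing value
      end
    end
    cells << "#{features.size} #{klass}"   # class is the last attribute
    "{" + cells.join(", ") + "}"
  else
    cells = features.map { |f| sample.has_key?(f) ? sample[f].to_s : "?" }
    (cells << klass.to_s).join(",")
  end
end

features = [:f1, :f2, :f3]
sample   = { :f1 => 1, :f2 => 0 }

puts arff_row(features, sample, :spam)         # dense row
puts arff_row(features, sample, :spam, true)   # sparse row
```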
\ No newline at end of file
