[Bug]: sporadic multiobj optimization failure on boosting pipelines #1337

Closed
Lopa10ko opened this issue Sep 25, 2024 · 0 comments · Fixed by #1320

Lopa10ko commented Sep 25, 2024

Current Behavior

Getting ValueError: Length of values (0) does not match length of index (606) at the predict stage of boosting pipelines in run_classification_multiobj_example.
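
For reference, the pandas error itself comes from assigning a length-0 sequence as a column of a non-empty DataFrame. A minimal sketch (the 606 rows are taken from the error message, not from FEDOT internals):

import numpy as np
import pandas as pd

# 606 rows, matching the index length reported in the error
dataframe = pd.DataFrame(np.zeros((606, 3)))
# assigning an empty target column raises the same error:
# ValueError: Length of values (0) does not match length of index (606)
dataframe['target'] = np.ravel(np.array([]))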

Steps to Reproduce

2024-09-24_21-25-09_pipeline_saved.zip

import pandas as pd

from fedot import Fedot
from fedot.core.pipelines.pipeline import Pipeline
from fedot.core.utils import fedot_project_root


def run_classification_multiobj_example(visualization=False, timeout=1, with_tuning=True):
    train_data = pd.read_csv(f'{fedot_project_root()}/examples/data/Hill_Valley_with_noise_Training.data')
    test_data = pd.read_csv(f'{fedot_project_root()}/examples/data/Hill_Valley_with_noise_Testing.data')
    target = test_data['class']
    del test_data['class']
    problem = 'classification'

    metric_names = ['f1', 'node_number']
    # use the attached saved boosting pipeline (from the zip above) as the initial assumption
    initial_assumption = Pipeline().load('./2024-09-24_21-25-09_pipeline_saved')
    auto_model = Fedot(problem=problem, timeout=timeout, preset='best_quality',
                       metric=metric_names,
                       with_tuning=with_tuning,
                       initial_assumption=initial_assumption)
    auto_model.fit(features=train_data, target='class')
    prediction = auto_model.predict_proba(features=test_data)
    print(auto_model.get_metrics(target))

    if visualization:
        auto_model.plot_prediction()
        auto_model.plot_pareto()

    return prediction


if __name__ == '__main__':
    run_classification_multiobj_example(timeout=0.1, with_tuning=False)

Possible solution

Check whether data.target is set to an empty sequence:

@staticmethod
def convert_to_dataframe(data: Optional[InputData], identify_cats: bool):
    dataframe = pd.DataFrame(data=data.features, columns=data.features_names)
    if data.target is not None:
        dataframe['target'] = np.ravel(data.target)
    else:
        # TODO: temp workaround in case data.target is set to None intentionally
        # for test.integration.models.test_model.check_predict_correct
        dataframe['target'] = np.zeros(len(data.features))
    if identify_cats and data.categorical_idx is not None:
        for col in dataframe.columns[data.categorical_idx]:
            dataframe[col] = dataframe[col].astype('category')
    if data.numerical_idx is not None:
        for col in dataframe.columns[data.numerical_idx]:
            dataframe[col] = dataframe[col].astype('float')
    return dataframe.drop(columns=['target']), dataframe['target']
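
A hedged sketch of that extra check, assuming the fix is simply to treat an empty target the same way as a missing one (names as in the snippet above, not necessarily the final change in #1320):

target = np.ravel(data.target) if data.target is not None else np.array([])
if target.size > 0:
    dataframe['target'] = target
else:
    # fall back to a dummy target when data.target is None or an empty sequence
    dataframe['target'] = np.zeros(len(data.features))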

Context

Failed integration tests run: https://github.com/aimclub/FEDOT/actions/runs/11013973383
Related to #1316

Lopa10ko added the bug label on Sep 25, 2024
Lopa10ko self-assigned this on Sep 25, 2024