fix: Improve path handling and type annotations in FaissVectorStoreComponent #6081

Merged
merged 12 commits into from
Feb 12, 2025

Conversation

Cristhianzl
Member

This pull request includes several changes to the FaissVectorStoreComponent class in the src/backend/base/langflow/components/vectorstores/faiss.py file. These changes aim to improve the handling of file paths, enhance type annotations, and streamline the search functionality.

Improvements to file path handling:

  • Added a new method resolve_path to resolve paths relative to the Langflow root.
  • Updated the build_vector_store method to use the new resolve_path method and ensure the directory exists before saving the FAISS index. [1] [2]
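A minimal sketch of the two path-handling steps described above, resolving a relative path against a root and creating the directory before anything is saved into it. The names `resolve_under_root` and `ensure_directory` and the explicit `root` argument are hypothetical, chosen for illustration; they are not the component's actual methods.

```python
import tempfile
from pathlib import Path


def resolve_under_root(path: str, root: Path) -> Path:
    """Resolve `path` against `root` unless it is already absolute (hypothetical helper)."""
    candidate = Path(path).expanduser()
    if not candidate.is_absolute():
        candidate = root / candidate
    return candidate.resolve()


def ensure_directory(path: Path) -> Path:
    """Create `path` and any missing parents so a later save does not hit a missing folder."""
    path.mkdir(parents=True, exist_ok=True)
    return path


# Demo in a throwaway directory, so nothing is written to the real project tree.
with tempfile.TemporaryDirectory() as tmp:
    target = ensure_directory(resolve_under_root("indexes/faiss", Path(tmp)))
    print(target.is_dir())
```

Taking the root as a parameter keeps the sketch independent of how the real component locates the Langflow root.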

Enhancements to type annotations:

  • Changed the return type of the search_documents method from list[Data] to List[Data] for consistency with other type annotations.

Streamlining search functionality:

  • Simplified the search_documents method by removing unnecessary logging and directly returning the search results.

#6072

…d file path handling

🐛 (faiss.py): fix issue with building vector store when persist_directory is not provided
🐛 (faiss.py): fix issue with loading FAISS index when index file does not exist
📝 (faiss.py): add type hints for search_documents method parameters and return value
📝 (faiss.py): remove unnecessary logging statements from search_documents method
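One way to handle the two failure cases named in these commit messages (no persist_directory, missing index file) is to fall back to the current directory and check for the file before loading. This is a sketch under assumptions: the helper `find_index_file` is hypothetical, and the `<index_name>.faiss` file name is assumed here rather than taken from this PR.

```python
from __future__ import annotations

from pathlib import Path


def find_index_file(persist_directory: str | None, index_name: str) -> Path:
    """Return the expected index file, or raise a clear error if it is absent.

    Hypothetical helper: the `.faiss` suffix is an assumption for illustration.
    """
    # Fall back to the current directory when no persist directory is configured.
    base = Path(persist_directory) if persist_directory else Path()
    index_file = base / f"{index_name}.faiss"
    if not index_file.is_file():
        msg = f"FAISS index file not found: {index_file}"
        raise FileNotFoundError(msg)
    return index_file
```

Failing early with the full path in the message tends to be easier to debug than an error raised deep inside the loader.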
@Cristhianzl Cristhianzl self-assigned this Feb 3, 2025
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Feb 3, 2025
@github-actions github-actions bot added the bug Something isn't working label Feb 3, 2025
@github-actions github-actions bot added bug Something isn't working and removed bug Something isn't working labels Feb 3, 2025
…rectory more efficiently

🔧 (faiss.py): refactor search_documents method to handle persist_directory more efficiently
…d persist directory path or current directory if not set

♻️ (faiss.py): refactor build_vector_store and search_documents methods to use get_persist_directory method for path resolution
@github-actions github-actions bot added bug Something isn't working and removed bug Something isn't working labels Feb 12, 2025
@@ -1,3 +1,5 @@
from pathlib import Path
Suggested change
- from pathlib import Path
+ from functools import cache
+ from pathlib import Path

@@ -44,16 +46,23 @@
),
]

def resolve_path(self, path: str) -> Path:
Suggested change
- def resolve_path(self, path: str) -> Path:
+ @cache
+ def resolve_path(self, path: str) -> Path:

Contributor

codeflash-ai bot commented Feb 12, 2025

⚡️ Codeflash found optimizations for this PR

📄 1,026% (10.26x) speedup for FaissVectorStoreComponent.get_persist_directory in src/backend/base/langflow/components/vectorstores/faiss.py

⏱️ Runtime : 28.0 milliseconds → 2.49 milliseconds (best of 112 runs)

📝 Explanation and details

To make this code faster, we cache path resolution so that the same directory path is not resolved repeatedly. functools.lru_cache provides the cache.

Here's the optimized version of the program.

Decorating the resolve_path method with @lru_cache caches the result of resolving each path. If resolve_path is called multiple times with the same argument, it returns the cached result instead of resolving the path again, which reduces the number of file system operations. The maxsize=None parameter makes the cache unbounded, so every unique path stays cached. functools.cache, used in the suggested change, is shorthand for lru_cache(maxsize=None).
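The caching described above can be sketched with a plain function. `resolve_cached` is a hypothetical stand-in for resolve_path, not the component's code, and `functools.cache` needs Python 3.9 or later.

```python
from functools import cache
from pathlib import Path


@cache  # shorthand for @lru_cache(maxsize=None)
def resolve_cached(path: str) -> Path:
    """Resolve a path once; later calls with the same string reuse the result."""
    return Path(path).expanduser().resolve()


first = resolve_cached("data/persist")   # computed (cache miss)
second = resolve_cached("data/persist")  # served from the cache (cache hit)

info = resolve_cached.cache_info()
print(info.hits, info.misses)  # 1 1
```

`cache_info()` is a quick way to confirm that repeated calls are actually hitting the cache.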

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 1029 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage undefined
🌀 Generated Regression Tests Details
import os
from pathlib import Path

# imports
import pytest  # used for our unit tests
# function to test
from langflow.base.vectorstores.model import LCVectorStoreComponent
from langflow.components.vectorstores.faiss import FaissVectorStoreComponent

# unit tests

# Basic Functionality
def test_persist_directory_relative():
    component = FaissVectorStoreComponent()
    component.persist_directory = "data/persist"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_absolute():
    component = FaissVectorStoreComponent()
    component.persist_directory = "/var/lib/data/persist"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_not_set():
    component = FaissVectorStoreComponent()
    component.persist_directory = None
    codeflash_output = component.get_persist_directory()

def test_persist_directory_empty_string():
    component = FaissVectorStoreComponent()
    component.persist_directory = ""
    codeflash_output = component.get_persist_directory()

# Edge Cases

def test_persist_directory_with_special_chars():
    component = FaissVectorStoreComponent()
    component.persist_directory = "data/!@#$%^&*()"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_with_env_vars(monkeypatch):
    monkeypatch.setenv("HOME", "/home/user")
    component = FaissVectorStoreComponent()
    component.persist_directory = "$HOME/data/persist"
    codeflash_output = component.get_persist_directory()

# Filesystem Edge Cases
def test_non_existent_directory():
    component = FaissVectorStoreComponent()
    component.persist_directory = "/non/existent/directory"
    codeflash_output = component.get_persist_directory()


def test_long_path_names():
    component = FaissVectorStoreComponent()
    long_path = "/a" * 255
    component.persist_directory = long_path
    codeflash_output = component.get_persist_directory()

def test_high_frequency_calls():
    component = FaissVectorStoreComponent()
    component.persist_directory = "data/persist"
    for _ in range(1000):
        codeflash_output = component.get_persist_directory()

# Platform-Specific Scenarios
def test_windows_path():
    component = FaissVectorStoreComponent()
    component.persist_directory = "C:\\data\\persist"
    codeflash_output = component.get_persist_directory()

def test_unix_path():
    component = FaissVectorStoreComponent()
    component.persist_directory = "/var/lib/data/persist"
    codeflash_output = component.get_persist_directory()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
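
The generated tests as shown only assign codeflash_output and contain no assertions. A minimal sketch of the comparison the comment describes, using two stand-in functions (`resolve_original` and `resolve_optimized` are hypothetical, not the component's methods):

```python
from functools import cache
from pathlib import Path


def resolve_original(path: str) -> Path:
    """Stand-in for the uncached implementation."""
    return Path(path).resolve()


@cache
def resolve_optimized(path: str) -> Path:
    """Stand-in for the cached implementation."""
    return Path(path).resolve()


SAMPLE_PATHS = ["data/persist", "./data", "..", "/var/lib/data/persist"]


def outputs_match(paths: list[str]) -> bool:
    """Return True when both implementations agree on every sample path."""
    return all(resolve_original(p) == resolve_optimized(p) for p in paths)


print(outputs_match(SAMPLE_PATHS))
```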

from pathlib import Path

# imports
import pytest  # used for our unit tests
# function to test
from langflow.base.vectorstores.model import LCVectorStoreComponent
from langflow.components.vectorstores.faiss import FaissVectorStoreComponent

# unit tests

class TestFaissVectorStoreComponent:
    @pytest.fixture
    def component(self):
        # Fixture to create an instance of FaissVectorStoreComponent
        return FaissVectorStoreComponent()

    def test_persist_directory_set_to_valid_relative_path(self, component):
        # Test with a valid relative path
        component.persist_directory = "data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_valid_absolute_path(self, component):
        # Test with a valid absolute path
        component.persist_directory = "/tmp/data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_not_set(self, component):
        # Test when persist_directory is not set
        component.persist_directory = None
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_special_characters(self, component):
        # Test with special character paths
        component.persist_directory = "."
        codeflash_output = component.get_persist_directory()

        component.persist_directory = ".."
        codeflash_output = component.get_persist_directory()

        component.persist_directory = "./data"
        codeflash_output = component.get_persist_directory()

        component.persist_directory = "../data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_non_string_type(self, component):
        # Test with non-string types
        component.persist_directory = 123
        with pytest.raises(TypeError):
            component.get_persist_directory()

        component.persist_directory = ["data"]
        with pytest.raises(TypeError):
            component.get_persist_directory()

        component.persist_directory = {"path": "data"}
        with pytest.raises(TypeError):
            component.get_persist_directory()

    def test_persist_directory_set_to_non_existent_path(self, component):
        # Test with a non-existent path
        component.persist_directory = "nonexistent/path"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_path_with_permission_issues(self, component):
        # Test with a path that likely has permission issues
        component.persist_directory = "/root/data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_windows_specific_path(self, component):
        # Test with Windows-specific paths
        component.persist_directory = "C:\\Program Files\\data"
        codeflash_output = component.get_persist_directory()

        component.persist_directory = "C:\\Users\\User\\data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_unix_specific_path(self, component):
        # Test with Unix-specific paths
        component.persist_directory = "/usr/local/data"
        codeflash_output = component.get_persist_directory()

        component.persist_directory = "~/data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_deeply_nested_path(self, component):
        # Test with a deeply nested path
        component.persist_directory = "a/" * 100 + "data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_path_with_symlinks(self, component):
        # Test with a path containing symlinks
        component.persist_directory = "/tmp/symlink_to_data"
        codeflash_output = component.get_persist_directory()

    def test_persist_directory_set_to_path_with_env_variables(self, component, monkeypatch):
        # Test with a path containing environment variables
        monkeypatch.setenv("HOME", "/home/user")
        component.persist_directory = "$HOME/data"
        codeflash_output = component.get_persist_directory()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.


@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Feb 12, 2025
@Cristhianzl Cristhianzl removed the lgtm This PR has been approved by a maintainer label Feb 12, 2025
@Cristhianzl Cristhianzl added the lgtm This PR has been approved by a maintainer label Feb 12, 2025
@@ -1,3 +1,5 @@
from pathlib import Path
Suggested change
- from pathlib import Path
+ from functools import cache
+ from pathlib import Path

@@ -44,16 +46,30 @@
),
]

@staticmethod
Suggested change
- @staticmethod
+ @staticmethod
+ @cache

Contributor

codeflash-ai bot commented Feb 12, 2025

⚡️ Codeflash found optimizations for this PR

📄 12,926% (129.26x) speedup for FaissVectorStoreComponent.resolve_path in src/backend/base/langflow/components/vectorstores/faiss.py

⏱️ Runtime : 6.49 milliseconds → 49.8 microseconds (best of 93 runs)

📝 Explanation and details

To optimize the given code, we focus on minimizing the overhead of frequently resolved paths: resolved paths are cached to avoid redundant computation.

Here is the optimized version.

Explanation.

  1. Cached Path Resolution: By applying the @lru_cache(maxsize=None) decorator, we cache the output of the resolve_path method. This means that if the same path is resolved multiple times, it will only be computed once, and subsequent calls will retrieve the result from the cache. This improves efficiency, especially when resolving the same paths repeatedly.
  2. Minimal Changes: The function signature, logic, and overall structure of the resolve_path method remain the same, ensuring the return values are as expected.

This should result in a more efficient path resolution process, particularly in scenarios where the same paths are being resolved multiple times.
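
A sketch of the decorator stacking on a static method, with `@staticmethod` outermost and `@cache` beneath it. `PathResolverDemo` is a hypothetical class for illustration, and its method body is an assumption rather than the component's real logic.

```python
from functools import cache
from pathlib import Path


class PathResolverDemo:
    """Hypothetical stand-in showing @staticmethod stacked over @cache."""

    @staticmethod
    @cache
    def resolve_path(path: str) -> Path:
        return Path(path).resolve()


# The cache lives on the function, so class and instance calls share it.
a = PathResolverDemo.resolve_path("data")
b = PathResolverDemo().resolve_path("data")

info = PathResolverDemo.resolve_path.cache_info()
print(a is b, info.hits, info.misses)  # True 1 1
```

Because the method takes no `self`, the cache key is only the path string, and `PathResolverDemo.resolve_path.cache_clear()` empties it when needed.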

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 67 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage undefined
🌀 Generated Regression Tests Details
from pathlib import Path

# imports
import pytest  # used for our unit tests
# function to test
from langflow.base.vectorstores.model import LCVectorStoreComponent
from langflow.components.vectorstores.faiss import FaissVectorStoreComponent


# unit tests
@pytest.mark.parametrize("input_path, expected_output", [
    # Basic Functionality
    ("/usr/local/bin", str(Path("/usr/local/bin").resolve())),
    ("C:\\Program Files\\Python", str(Path("C:\\Program Files\\Python").resolve())),
    ("./my_folder", str(Path("./my_folder").resolve())),
    ("../parent_folder", str(Path("../parent_folder").resolve())),
    
    # Edge Cases
    ("", str(Path("").resolve())),
    (".", str(Path(".").resolve())),
    ("..", str(Path("..").resolve())),
    
    # Complex Relative Paths
    ("folder/../another_folder", str(Path("folder/../another_folder").resolve())),
    ("./folder/../another_folder/./sub_folder", str(Path("./folder/../another_folder/./sub_folder").resolve())),
    # Note: Symbolic link tests would require setup in the test environment
    
    # Invalid Paths
    ("folder/illegal\0name", None),  # This should raise an exception
    ("folder/illegal|name", None),   # This should raise an exception
    ("non_existent_folder/non_existent_file", str(Path("non_existent_folder/non_existent_file").resolve())),
    
    # Platform-Specific Paths
    ("C:\\Users\\User\\Documents", str(Path("C:\\Users\\User\\Documents").resolve())),
    ("C:/Users/User/Documents", str(Path("C:/Users/User/Documents").resolve())),
    ("/home/user/documents", str(Path("/home/user/documents").resolve())),
    ("/var/log/syslog", str(Path("/var/log/syslog").resolve())),
    
    # Large Path Strings
    ("folder/" * 1000 + "file", str(Path("folder/" * 1000 + "file").resolve())),
    
    # Path with Special Characters
    ("folder with spaces/file", str(Path("folder with spaces/file").resolve())),
    ("folder_with_special_chars/!@#$%^&*()_+{}|:<>?~", str(Path("folder_with_special_chars/!@#$%^&*()_+{}|:<>?~").resolve())),
    
    # Path with Environment Variables
    # Note: These tests assume the environment variables are set correctly
    (str(Path.home() / "Documents"), str((Path.home() / "Documents").resolve())),
])
def test_resolve_path(input_path, expected_output):
    # Check if the expected output is None, which means we expect an exception
    if expected_output is None:
        with pytest.raises(ValueError):
            FaissVectorStoreComponent.resolve_path(input_path)
    else:
        # Otherwise, check if the resolved path matches the expected output
        codeflash_output = FaissVectorStoreComponent.resolve_path(input_path)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import os
from pathlib import Path

# imports
import pytest  # used for our unit tests
# function to test
from langflow.base.vectorstores.model import LCVectorStoreComponent
from langflow.components.vectorstores.faiss import FaissVectorStoreComponent

# unit tests

def test_basic_absolute_path():
    # Test with an absolute path
    codeflash_output = FaissVectorStoreComponent.resolve_path("/usr/local/bin")
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/user/docs")

def test_relative_path():
    # Test with a relative path
    codeflash_output = FaissVectorStoreComponent.resolve_path("./folder")
    codeflash_output = FaissVectorStoreComponent.resolve_path("../parent_folder")

def test_path_with_symbolic_links():
    # Test with a path that includes symbolic links
    codeflash_output = FaissVectorStoreComponent.resolve_path("/var/log/../log/syslog")
    codeflash_output = FaissVectorStoreComponent.resolve_path("~/symlink_to_docs")

def test_path_with_current_directory():
    # Test with a path that includes the current directory symbol
    codeflash_output = FaissVectorStoreComponent.resolve_path("./file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("./folder/./file.txt")

def test_path_with_parent_directory():
    # Test with a path that includes the parent directory symbol
    codeflash_output = FaissVectorStoreComponent.resolve_path("../file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/../file.txt")

def test_home_directory_shortcut():
    # Test with a path that includes the home directory shortcut
    codeflash_output = FaissVectorStoreComponent.resolve_path("~/file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("~/folder/file.txt")

def test_empty_path():
    # Test with an empty path string
    codeflash_output = FaissVectorStoreComponent.resolve_path("")

def test_path_with_special_characters():
    # Test with a path that includes special characters
    codeflash_output = FaissVectorStoreComponent.resolve_path("/path/with/special!@#$/file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/with space/file.txt")

def test_non_existent_path():
    # Test with a path that does not exist in the filesystem
    codeflash_output = FaissVectorStoreComponent.resolve_path("/non/existent/path/file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("non_existent_folder/file.txt")

def test_path_with_mixed_separators():
    # Test with a path that uses mixed directory separators
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder\\subfolder/file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/subfolder\\file.txt")

def test_root_directory_path():
    # Test with the root directory path
    codeflash_output = FaissVectorStoreComponent.resolve_path("/")
    if os.name == 'nt':  # Windows-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("C:\\")

def test_path_with_drive_letters():
    # Test with paths that include drive letters (Windows)
    if os.name == 'nt':  # Windows-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("C:\\Users\\User")
        codeflash_output = FaissVectorStoreComponent.resolve_path("D:\\Folder\\File.txt")

def test_path_with_environment_variables():
    # Test with paths that include environment variables
    codeflash_output = FaissVectorStoreComponent.resolve_path(os.path.expandvars("$HOME/file.txt"))
    if os.name == 'nt':  # Windows-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path(os.path.expandvars("%USERPROFILE%\\file.txt"))

def test_large_path():
    # Test with a very long path
    long_path = "/a/" + "b" * 255 + "/c"
    codeflash_output = FaissVectorStoreComponent.resolve_path(long_path)
    long_path = "folder/" + "subfolder/" * 50 + "file.txt"
    codeflash_output = FaissVectorStoreComponent.resolve_path(long_path)

def test_path_with_trailing_separators():
    # Test with paths that end with a directory separator
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/user/")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/subfolder/")

def test_path_with_multiple_consecutive_separators():
    # Test with paths that contain multiple consecutive directory separators
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home//user///docs")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder//subfolder///file.txt")

def test_path_with_only_separators():
    # Test with paths that consist entirely of directory separators
    codeflash_output = FaissVectorStoreComponent.resolve_path("////")
    if os.name == 'nt':  # Windows-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("\\\\")

def test_path_with_special_files():
    # Test with paths that are just the current or parent directory symbols
    codeflash_output = FaissVectorStoreComponent.resolve_path(".")
    codeflash_output = FaissVectorStoreComponent.resolve_path("..")

def test_path_with_mixed_case_sensitivity():
    # Test with paths that have mixed case sensitivity
    codeflash_output = FaissVectorStoreComponent.resolve_path("/Home/UsEr/DoCs")
    codeflash_output = FaissVectorStoreComponent.resolve_path("FoLdEr/FiLe.TxT")

def test_path_with_unicode_characters():
    # Test with paths that include Unicode characters
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/用户/文件")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/📁/file.txt")

def test_path_with_reserved_characters():
    # Test with paths that include reserved characters (Windows)
    if os.name == 'nt':  # Windows-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("C:\\folder\\file<name>.txt")
        codeflash_output = FaissVectorStoreComponent.resolve_path("C:\\folder\\file|name.txt")

def test_path_with_control_characters():
    # Test with paths that include control characters
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/user/file\nname.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/subfolder/file\tname.txt")


def test_path_with_special_device_files():
    # Test with paths that refer to special device files (Unix)
    if os.name != 'nt':  # Unix-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("/dev/null")
        codeflash_output = FaissVectorStoreComponent.resolve_path("/dev/tty")

def test_path_with_network_locations():
    # Test with paths that refer to network locations or mounted network drives
    codeflash_output = FaissVectorStoreComponent.resolve_path("//server/share/folder/file.txt")
    if os.name == 'nt':  # Windows-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("\\\\server\\share\\folder\\file.txt")

def test_path_with_spaces_and_tabs():
    # Test with paths that include spaces and tabs
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/user/my documents/file.txt")
    codeflash_output = FaissVectorStoreComponent.resolve_path("folder/subfolder/file name.txt")

def test_path_with_special_mount_points():
    # Test with paths that refer to special mount points (Unix)
    if os.name != 'nt':  # Unix-specific test
        codeflash_output = FaissVectorStoreComponent.resolve_path("/mnt/external_drive/file.txt")
        codeflash_output = FaissVectorStoreComponent.resolve_path("/media/user/USB/file.txt")

def test_path_with_hidden_files_or_directories():
    # Test with paths that refer to hidden files or directories
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/user/.hidden_file")
    codeflash_output = FaissVectorStoreComponent.resolve_path("/home/user/.hidden_folder/file.txt")
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.


Comment on lines +62 to +64
if self.persist_directory:
    return Path(self.resolve_path(self.persist_directory))
return Path()
Suggested change
- if self.persist_directory:
-     return Path(self.resolve_path(self.persist_directory))
- return Path()
+ return Path(self.resolve_path(self.persist_directory)) if self.persist_directory else Path()

Contributor

codeflash-ai bot commented Feb 12, 2025

⚡️ Codeflash found optimizations for this PR

📄 2,643% (26.43x) speedup for FaissVectorStoreComponent.get_persist_directory in src/backend/base/langflow/components/vectorstores/faiss.py

⏱️ Runtime : 5.49 milliseconds → 200 microseconds (best of 67 runs)

📝 Explanation and details

To optimize this code while keeping its existing logic, we cache repeated work such as path resolution. We add functools.lru_cache to cache the resolved paths, avoid repeated accesses to the instance attribute self.persist_directory, and tighten the function definitions. Here is an optimized version of the code.

Optimizations and Changes.

  1. Path Resolution Caching: Added lru_cache to the resolve_path method to cache resolved paths for subsequent faster access.
  2. Inline the Check: The check for self.persist_directory is folded into the return statement as a conditional expression.
  3. Efficiency Improvements: Used fewer operations and direct returns where possible for better performance and code clarity.
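
The inlined check from item 2 can be sketched as follows. `PersistDirectoryDemo` is a hypothetical stand-in class, and its `resolve_path` body is an assumption made so the example runs on its own.

```python
from __future__ import annotations

from pathlib import Path


class PersistDirectoryDemo:
    """Hypothetical stand-in for the component, keeping only the path logic."""

    def __init__(self, persist_directory: str | None = None) -> None:
        self.persist_directory = persist_directory

    @staticmethod
    def resolve_path(path: str) -> Path:
        return Path(path).resolve()

    def get_persist_directory(self) -> Path:
        # None and "" are both falsy, so both fall back to the current directory.
        return Path(self.resolve_path(self.persist_directory)) if self.persist_directory else Path()


print(PersistDirectoryDemo(None).get_persist_directory() == Path("."))  # True
print(PersistDirectoryDemo("data").get_persist_directory().is_absolute())  # True
```

A whitespace-only string such as "   " is truthy, so it takes the resolve branch rather than the fallback.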

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 50 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage undefined
🌀 Generated Regression Tests Details
from pathlib import Path

# imports
import pytest  # used for our unit tests
# function to test
from langflow.base.vectorstores.model import LCVectorStoreComponent
from langflow.components.vectorstores.faiss import FaissVectorStoreComponent

# unit tests

def test_persist_directory_absolute_path():
    """Test with an absolute path"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "/home/user/data"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_relative_path():
    """Test with a relative path"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "./data"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_not_set():
    """Test when persist_directory is not set"""
    component = FaissVectorStoreComponent()
    component.persist_directory = None
    codeflash_output = component.get_persist_directory()

def test_persist_directory_empty_string():
    """Test when persist_directory is an empty string"""
    component = FaissVectorStoreComponent()
    component.persist_directory = ""
    codeflash_output = component.get_persist_directory()

def test_persist_directory_non_existent_path():
    """Test with a non-existent path"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "/path/that/does/not/exist"
    codeflash_output = component.get_persist_directory()


def test_persist_directory_special_characters():
    """Test with a path containing special characters"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "./data with spaces"
    codeflash_output = component.get_persist_directory()


def test_persist_directory_mixed_separators():
    """Test with a path containing mixed separators"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "folder/subfolder\\file"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_unix_like_on_windows():
    """Test Unix-like path on Windows"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "/mnt/c/Users/User/data"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_windows_on_unix():
    """Test Windows path on Unix"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "C:\\Users\\User\\data"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_deeply_nested():
    """Test with deeply nested directories"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "dir1/dir2/dir3/dir4/dir5/dir6"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_long_path():
    """Test with a long path name"""
    long_path = "a_very_long_directory_name_that_exceeds_normal_length_limits"
    component = FaissVectorStoreComponent()
    component.persist_directory = long_path
    codeflash_output = component.get_persist_directory()

def test_persist_directory_current_directory():
    """Test with the current directory"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "."
    codeflash_output = component.get_persist_directory()

def test_persist_directory_parent_directory():
    """Test with the parent directory"""
    component = FaissVectorStoreComponent()
    component.persist_directory = ".."
    codeflash_output = component.get_persist_directory()

def test_persist_directory_root_directory():
    """Test with the root directory"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "/"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_symlink():
    """Test with a symlink to a directory"""
    symlink_path = "/path/to/symlink"
    component = FaissVectorStoreComponent()
    component.persist_directory = symlink_path
    codeflash_output = component.get_persist_directory()

def test_persist_directory_empty_string_repeated():
    """Test with an empty string"""
    component = FaissVectorStoreComponent()
    component.persist_directory = ""
    codeflash_output = component.get_persist_directory()

def test_persist_directory_whitespace_string():
    """Test with a whitespace string"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "   "
    codeflash_output = component.get_persist_directory()

def test_persist_directory_trailing_slash():
    """Test with a trailing slash"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "/home/user/data/"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_dot_notation():
    """Test with dot notation"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "./"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_double_dot_notation():
    """Test with double dot notation"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "../"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_network_path():
    """Test with a network path"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "\\\\network\\share\\folder"
    codeflash_output = component.get_persist_directory()

def test_persist_directory_device_path():
    """Test with a device path"""
    component = FaissVectorStoreComponent()
    component.persist_directory = "/dev/sda1"
    codeflash_output = component.get_persist_directory()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

# imports
import os
from pathlib import Path

import pytest  # used for our unit tests

# function to test
from langflow.base.vectorstores.model import LCVectorStoreComponent
from langflow.components.vectorstores.faiss import FaissVectorStoreComponent

# unit tests

def test_basic_functionality():
    """Test basic functionality with valid paths."""
    component = FaissVectorStoreComponent()
    
    # Valid relative path
    component.persist_directory = "data/store"
    codeflash_output = component.get_persist_directory()
    
    # Valid absolute path
    component.persist_directory = "/home/user/data/store"
    codeflash_output = component.get_persist_directory()
    
    # Not set (None)
    component.persist_directory = None
    codeflash_output = component.get_persist_directory()
    
    # Not set (empty string)
    component.persist_directory = ""
    codeflash_output = component.get_persist_directory()

def test_path_resolution():
    """Test path resolution with special characters, spaces, and relative components."""
    component = FaissVectorStoreComponent()
    
    # Path with spaces
    component.persist_directory = "data with spaces/store"
    codeflash_output = component.get_persist_directory()
    
    # Path with special characters
    component.persist_directory = "data/@special!#$/store"
    codeflash_output = component.get_persist_directory()
    
    # Relative path components
    component.persist_directory = "../data/store"
    codeflash_output = component.get_persist_directory()
    
    component.persist_directory = "./data/store"
    codeflash_output = component.get_persist_directory()


def test_edge_cases():
    """Test edge cases with maximum length and unusual characters."""
    component = FaissVectorStoreComponent()
    
    # Single path component at the common 255-character filename limit
    long_path = "a" * 255
    component.persist_directory = long_path
    codeflash_output = component.get_persist_directory()
    
    # Unusual but valid characters
    component.persist_directory = "data/üñîçødë/store"
    codeflash_output = component.get_persist_directory()

def test_environment_specific_paths():
    """Test paths with environment variables and tilde expansion."""
    component = FaissVectorStoreComponent()
    
    # Environment variable
    component.persist_directory = "$HOME/data/store"
    codeflash_output = component.get_persist_directory()
    
    # Tilde expansion
    component.persist_directory = "~/data/store"
    codeflash_output = component.get_persist_directory()

def test_filesystem_permissions():
    """Test paths without write permissions."""
    component = FaissVectorStoreComponent()
    
    # Path without write permissions
    component.persist_directory = "/root/protected/store"
    codeflash_output = component.get_persist_directory()

def test_large_scale():
    """Test deeply nested paths."""
    component = FaissVectorStoreComponent()
    
    # Deeply nested path
    nested_path = "a/" * 100 + "store"
    component.persist_directory = nested_path
    codeflash_output = component.get_persist_directory()

def test_cross_platform_paths():
    """Test cross-platform paths."""
    component = FaissVectorStoreComponent()
    
    # Windows-style path on Unix
    component.persist_directory = "C:\\Users\\user\\data\\store"
    codeflash_output = component.get_persist_directory()
    
    # WSL-style mount path (a Unix path that maps to a Windows drive)
    component.persist_directory = "/mnt/c/Users/user/data/store"
    codeflash_output = component.get_persist_directory()

def test_unusual_filesystem_behavior():
    """Test paths shaped like symbolic links, mount points, or junction points (none are created on disk)."""
    component = FaissVectorStoreComponent()
    
    # Symbolic link
    component.persist_directory = "/symlink/to/store"
    codeflash_output = component.get_persist_directory()
    
    # Mount point
    component.persist_directory = "/mnt/external_drive/store"
    codeflash_output = component.get_persist_directory()
    
    # Junction point (Windows)
    component.persist_directory = "C:\\junction\\to\\store"
    codeflash_output = component.get_persist_directory()

def test_special_files():
    """Test paths that point to special files."""
    component = FaissVectorStoreComponent()
    
    # Device file
    component.persist_directory = "/dev/null"
    codeflash_output = component.get_persist_directory()
    
    # Named pipe
    component.persist_directory = "/tmp/named_pipe"
    codeflash_output = component.get_persist_directory()

def test_network_paths():
    """Test network paths."""
    component = FaissVectorStoreComponent()
    
    # SMB share
    component.persist_directory = "//network_share/data/store"
    codeflash_output = component.get_persist_directory()
    
    # NFS-style URL string
    component.persist_directory = "nfs://server/path/to/store"
    codeflash_output = component.get_persist_directory()

def test_concurrent_access():
    """Test two components configured with the same path (calls run sequentially, not in parallel)."""
    component1 = FaissVectorStoreComponent()
    component2 = FaissVectorStoreComponent()
    
    # Path shared by both components
    shared_path = "/shared/data/store"
    component1.persist_directory = shared_path
    component2.persist_directory = shared_path
    
    codeflash_output = component1.get_persist_directory()
    codeflash_output = component2.get_persist_directory()

def test_reserved_names():
    """Test paths with reserved names on Windows."""
    component = FaissVectorStoreComponent()
    
    # Reserved name CON
    component.persist_directory = "C:\\CON\\store"
    codeflash_output = component.get_persist_directory()
    
    # Reserved name PRN
    component.persist_directory = "C:\\PRN\\store"
    codeflash_output = component.get_persist_directory()


def test_extremely_long_path():
    """Test extremely long path."""
    component = FaissVectorStoreComponent()
    
    # Extremely long path
    long_path = "a/" * 1000 + "store"
    component.persist_directory = long_path
    codeflash_output = component.get_persist_directory()

def test_mixed_separators():
    """Test paths with mixed directory separators."""
    component = FaissVectorStoreComponent()
    
    # Mixed separators
    component.persist_directory = "data\\store/with\\mixed/separators"
    codeflash_output = component.get_persist_directory()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
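
The closing comment above says `codeflash_output` exists so that the original and optimized code can be compared on the same inputs. A minimal sketch of that kind of equivalence check follows. It is an illustration only, not Codeflash's actual mechanism: `resolve_original`, `resolve_optimized`, `find_mismatches`, and the `/langflow` root are hypothetical names, and neither function is the real `get_persist_directory` implementation.

```python
from pathlib import PurePosixPath

# Hypothetical "original" implementation (NOT the Langflow code):
# return None for unset values, keep absolute paths, join relative ones to a root.
def resolve_original(persist_directory, root):
    if not persist_directory:
        return None
    path = PurePosixPath(persist_directory)
    if path.is_absolute():
        return path
    return PurePosixPath(root) / path

# Hypothetical "optimized" variant: relies on the pathlib rule that an
# absolute segment discards the segments before it when joining.
def resolve_optimized(persist_directory, root):
    if not persist_directory:
        return None
    return PurePosixPath(root, persist_directory)

def find_mismatches(inputs, root="/langflow"):
    """Run both variants on every input and collect the inputs whose outputs differ."""
    mismatches = []
    for value in inputs:
        before = resolve_original(value, root)
        after = resolve_optimized(value, root)
        if before != after:
            mismatches.append((value, before, after))
    return mismatches

# A few of the inputs used by the generated tests above.
sample_inputs = [None, "", "   ", ".", "..", "data/store", "/home/user/data/", "C:\\Users\\User\\data"]
mismatches = find_mismatches(sample_inputs)
```

An empty `mismatches` list means the two variants agreed on every sampled input; it says nothing about inputs that were not sampled, which is why the generated tests cover so many path shapes.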


@Cristhianzl Cristhianzl added this pull request to the merge queue Feb 12, 2025
Merged via the queue into main with commit fda2f17 Feb 12, 2025
35 checks passed
@Cristhianzl Cristhianzl deleted the cz/fix-fass branch February 12, 2025 18:43
Labels
bug Something isn't working lgtm This PR has been approved by a maintainer size:M This PR changes 30-99 lines, ignoring generated files.