To improve the LLM's performance in understanding and answering questions based on table data from an image, follow these steps:
**Step 1: Extract Table Data from the Image**

```python
import pytesseract
from PIL import Image

# Load the image
image = Image.open('tournament_standings.png')

# Use Tesseract to extract the text
extracted_text = pytesseract.image_to_string(image)
```

**Step 2: Structure the Extracted Data**

Assume the extracted text can be structured into CSV format:

```python
import pandas as pd
from io import StringIO

# Example extracted text
csv_data = """
Team,Played,Won,Drawn,Lost,Points
Team A,10,8,1,1,25
Team B,10,7,2,1,23
Team C,10,6,3,1,21
"""

# Convert to a DataFrame
df = pd.read_csv(StringIO(csv_data))
```
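One caveat between these two steps: Tesseract rarely emits clean comma-separated rows, and table columns usually come back separated by runs of spaces. A small normalization pass can bridge the gap (a sketch, assuming the header row survives OCR and that columns are separated by tabs or two-plus spaces; `ocr_lines_to_csv` is a hypothetical helper, not a LangChain or pytesseract API):

```python
import re

def ocr_lines_to_csv(raw_text: str) -> str:
    """Convert whitespace-separated OCR rows into CSV text.

    Assumes the first non-empty line is the header and that columns
    are separated by tabs or runs of two or more spaces."""
    rows = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on tabs or runs of 2+ spaces, not single spaces,
        # so multi-word team names like "Team A" stay in one cell.
        cells = re.split(r"\t+| {2,}", line)
        rows.append(",".join(cell.strip() for cell in cells))
    return "\n".join(rows)

raw = "Team      Played  Won  Drawn  Lost  Points\nTeam A    10      8    1      1    25"
print(ocr_lines_to_csv(raw))
# Team,Played,Won,Drawn,Lost,Points
# Team A,10,8,1,1,25
```

If the OCR output is messier than this (merged columns, misread digits), it is usually worth fixing the text before embedding it, since the LLM cannot recover structure that was lost upstream.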
**Step 3: Load the Structured Data Using LangChain's Document Loaders**

```python
from langchain.document_loaders import CSVLoader

# Load the CSV data (each row becomes one document)
loader = CSVLoader(file_path='tournament_standings.csv')
documents = loader.load()
```
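This row-wise loading is the key to retrieval quality: CSVLoader turns each CSV row into a document whose text is `column: value` pairs, so every team's full record stays in one self-describing chunk instead of being split across a flattened table. A dependency-free sketch of that same layout (`rows_to_documents` is a hypothetical helper for illustration):

```python
import pandas as pd

df = pd.DataFrame(
    {"Team": ["Team A", "Team B"], "Played": [10, 10], "Points": [25, 23]}
)

def rows_to_documents(frame: pd.DataFrame) -> list[str]:
    """Mirror CSVLoader's row-per-document layout: one 'col: value'
    line per column, one document string per row."""
    return [
        "\n".join(f"{col}: {row[col]}" for col in frame.columns)
        for _, row in frame.iterrows()
    ]

docs = rows_to_documents(df)
print(docs[0])
# Team: Team A
# Played: 10
# Points: 25
```

Embedding these self-describing chunks is what makes the retrieved context legible to the LLM, in a way a raw table screenshot never is.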
**Step 4: Use Prompt Templates**

```python
from langchain.prompts import PromptTemplate

# Create a prompt template
template = """
Given the following sports tournament standings:
{standings}
Answer the following question: {question}
"""
prompt = PromptTemplate(template=template, input_variables=["standings", "question"])

# Format the prompt with the standings and a sample question
formatted_prompt = prompt.format(
    standings=df.to_string(),
    question="Which team has the most points?",
)
```
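If you want to sanity-check the template mechanics without any LangChain dependency, plain `str.format` performs the same substitution, which makes it easy to confirm that the standings text really lands inside the prompt (a minimal sketch):

```python
# Same placeholders as the PromptTemplate above, filled with str.format
template = (
    "Given the following sports tournament standings:\n"
    "{standings}\n"
    "Answer the following question: {question}"
)

standings = "Team,Points\nTeam A,25\nTeam B,23\nTeam C,21"
formatted = template.format(
    standings=standings,
    question="Which team has the most points?",
)

# The standings rows should now appear verbatim in the prompt text
print(formatted)
```

If the table rows do not show up verbatim in the formatted prompt, the model is being asked to answer from context it never received, which matches the symptom described in the question below.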
**Step 5: Implement an Output Parser**

Note that `StructuredOutputParser` requires `response_schemas`, so for a simple custom parser it is easier to subclass `BaseOutputParser`:

```python
from langchain.schema import BaseOutputParser

# Define a simple output parser
class SimpleOutputParser(BaseOutputParser):
    def parse(self, text: str) -> dict:
        # Custom parsing logic: strip whitespace and wrap in a dict
        return {"answer": text.strip()}

# Use the output parser
parser = SimpleOutputParser()
response = parser.parse("Team A has the most points with 25 points.")
```

By following these steps, you can significantly improve the LLM's ability to understand and answer questions about table data extracted from an image of sports tournament standings [1][2][3][4].
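Once the pipeline runs end to end, it is worth validating the LLM's answers against the DataFrame itself, since the structured table answers ranking questions deterministically (a sketch using only pandas; the column names match the example CSV above, and `llm_answer` is a stand-in for a real model response):

```python
import pandas as pd
from io import StringIO

csv_data = """Team,Played,Won,Drawn,Lost,Points
Team A,10,8,1,1,25
Team B,10,7,2,1,23
Team C,10,6,3,1,21
"""
df = pd.read_csv(StringIO(csv_data))

# Deterministic ground truth for "Which team has the most points?"
top_team = df.loc[df["Points"].idxmax(), "Team"]

# A cheap consistency check on the LLM's free-text answer
llm_answer = "Team A has the most points with 25 points."
print(top_team)               # Team A
print(top_team in llm_answer) # True
```

Checks like this quickly tell you whether a wrong answer comes from bad retrieval, a bad prompt, or bad OCR, rather than from the model itself.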
Description
I have a table in an image: it's the Group A match results for Euro Cup 2024. I'd like to feed this data into a vector database and then answer questions based on it. I tried converting it to a PDF and also using the PNG format directly, but both failed. The data was embedded and added to the vector database, yet the LLM is not able to answer questions based on it; it seems the LLM doesn't understand the data. Can anybody advise what I should do to improve the LLM's performance so that it understands data like this?