
LLM can't be trusted to parse it's own json #148

Closed
St4rgarden opened this issue Oct 31, 2024 · 8 comments
Labels: bug

Comments

St4rgarden (Contributor) commented Oct 31, 2024

Describe the bug

We trust the LLM to format its own JSON, resulting in what a separate issue referred to as an infinite loop (which technically will resolve itself if left alone to smash on the OpenAI endpoint for long enough).

```
# Instructions: Write the next message for lina. Include an action, if appropriate. Possible response actions: MUTE_ROOM, ASK_CLAUDE, NONE, IGNORE
```

Response format should be formatted in a JSON block like this:

```json
{ "user": "lina", "text": string, "action": string }
```

Message is:

```json
{ "user": "lina", "text": "Oh honey~ Working with a pioneer sounds tantalizing... but only if he can keep up with me and my fiery spirit 😉 Now spill the details or I might get bored!", "action": NONE }
```

response is:

```json
{ "user": "lina", "text": "Oh honey~ Working with a pioneer sounds tantalizing... but only if he can keep up with me and my fiery spirit 😉 Now spill the details or I might get bored!", "action": NONE }
```

```
parsedContent is null
parsedContent is null, retrying
```

Notice above that the `action` value `NONE` is not a string. Now take a look at the correctly parsed JSON immediately following this:

```
parsedContent is {
  user: 'lina',
  text: "Oh darling st4rgard3n~ I'm always up for a little blockchain banter or maybe some spicy discussions about funding public goods... but don't think I won't call you out if you get all serious on me.<br> So what's the plan with @mattyryze?",
  action: 'NONE'
}
```

Here the LLM has correctly formatted NONE as 'NONE', a proper string.

To Reproduce

Just run Eliza with a cheap LLM model for long enough and you will definitely encounter this one.

Expected behavior

The message returned from the LLM should be formatted into JSON by the program itself, rather than trusting the model to emit valid JSON.
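Something along these lines would do it (names here are illustrative, not from the codebase): let the model pick the fields, validate the action against the known set, and let JSON.stringify handle quoting so a bare NONE can never leak out:

```typescript
// Hypothetical sketch: the program, not the model, owns serialization.
type Action = "MUTE_ROOM" | "ASK_CLAUDE" | "NONE" | "IGNORE";

interface AgentMessage {
  user: string;
  text: string;
  action: Action;
}

const VALID_ACTIONS: Action[] = ["MUTE_ROOM", "ASK_CLAUDE", "NONE", "IGNORE"];

function buildMessageJson(user: string, text: string, action: string): string {
  // Validate the action the model chose instead of trusting its quoting.
  const safeAction: Action = VALID_ACTIONS.includes(action as Action)
    ? (action as Action)
    : "NONE";
  const message: AgentMessage = { user, text, action: safeAction };
  // JSON.stringify always quotes strings, so NONE can never come out bare.
  return JSON.stringify(message);
}
```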

St4rgarden added the bug label on Oct 31, 2024
St4rgarden (Contributor, Author) commented Oct 31, 2024

Issue #70 is not accurate, but it's effectively a duplicate of this issue now.

twilwa (Collaborator) commented Oct 31, 2024

several python libs solve/attempt to solve this, in order of my personal opinion of them:

- outlines
- instructor
- lmql
- guidance

probably more -- however, not sure if any have a typescript equivalent

twilwa (Collaborator) commented Oct 31, 2024

if it's openai, we can use structured output mode: https://platform.openai.com/docs/guides/structured-outputs
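roughly, with the openai npm package (model name and schema here are illustrative, not from the repo):

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function getMessage(prompt: string) {
  const completion = await client.chat.completions.create({
    model: "gpt-4o-2024-08-06", // structured outputs needs a recent model
    messages: [{ role: "user", content: prompt }],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "agent_message",
        strict: true, // the API guarantees output matching the schema
        schema: {
          type: "object",
          properties: {
            user: { type: "string" },
            text: { type: "string" },
            action: {
              type: "string",
              enum: ["MUTE_ROOM", "ASK_CLAUDE", "NONE", "IGNORE"],
            },
          },
          required: ["user", "text", "action"],
          additionalProperties: false,
        },
      },
    },
  });
  // With strict mode the content is guaranteed-valid JSON for this schema.
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```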

sirkitree added this to Eliza on Nov 1, 2024
twilwa (Collaborator) commented Nov 1, 2024

kind of a hacky workaround for non-openai models:
run the model through a LiteLLM proxy server: https://github.com/BerriAI/litellm

https://docs.litellm.ai/docs/completion/json_mode -- it's called json mode, but i think you can do any kind of structured output. Just replace the OPENAI_API_URL with localhost:4000 and it should be compatible.
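a sketch of that wiring, assuming the proxy is running on localhost:4000 (model name and key handling are illustrative):

```typescript
import OpenAI from "openai";

// Same OpenAI client, but pointed at the local LiteLLM proxy.
const client = new OpenAI({
  baseURL: "http://localhost:4000",
  apiKey: process.env.LITELLM_API_KEY ?? "sk-anything", // proxy handles real auth
});

async function getMessage(prompt: string) {
  const completion = await client.chat.completions.create({
    model: "claude-3-5-sonnet", // whichever model the proxy routes to (illustrative)
    messages: [{ role: "user", content: prompt }],
    // LiteLLM translates json mode to the provider's equivalent where supported.
    response_format: { type: "json_object" },
  });
  return JSON.parse(completion.choices[0].message.content ?? "{}");
}
```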

alextitonis (Collaborator) commented Nov 1, 2024

This could help with the issue:

```typescript
function parseLLMJson<T>(rawResponse: string): T {
  // Heuristic sanitizer: quote bare-word values (e.g. `"action": NONE`)
  // while leaving numbers, booleans, null, and quoted strings intact.
  // Keys may or may not already be quoted; quoted string values are
  // matched as a whole so commas and colons inside them stay untouched.
  const sanitizedJson = rawResponse.replace(
    /"?(\w+)"?\s*:\s*("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|[^,}\s]+)/g,
    (match, key, value) => {
      // Don't quote if it's a number
      if (/^-?\d+(\.\d+)?$/.test(value)) {
        return `"${key}": ${value}`;
      }

      // Don't quote if it's a boolean or null
      if (value === "true" || value === "false" || value === "null") {
        return `"${key}": ${value}`;
      }

      // Normalize already-quoted values to double quotes
      if (/^["'][\s\S]*["']$/.test(value)) {
        return `"${key}": ${value.replace(/^['"]([\s\S]*)['"]$/, '"$1"')}`;
      }

      // Quote everything else (bare words such as NONE)
      return `"${key}": "${value}"`;
    }
  );

  try {
    return JSON.parse(sanitizedJson) as T;
  } catch (error) {
    console.error("Failed to parse JSON:", error);
    throw new Error("Invalid JSON format");
  }
}
```
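For example, fed the failing payload from the logs above (a quick illustration, not a test from the repo):

```typescript
const raw = `{ "user": "lina", "text": "spill the details!", "action": NONE }`;
const msg = parseLLMJson<{ user: string; text: string; action: string }>(raw);
console.log(msg.action); // "NONE" (now a proper string)
```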

Elyx0 commented Nov 2, 2024

@St4rgarden I wonder if simply explaining it better in the instructions would solve it, something like:

Possible response actions: MUTE_ROOM, ASK_CLAUDE, NONE, IGNORE

Response format should be formatted in a JSON block like this:

```json
{ "user": "lina", "text": string, "action": string }
```

Example:

```json
{ "user": "lina", "text": "sometext", "action": "ASK_CLAUDE" }
```

lalalune (Member) commented Nov 4, 2024

yep. hi @Elyx0 :)

monilpat (Collaborator) commented Nov 14, 2024

Yeah, I had a similar question about the current approach for generateObject in packages/core/generation.ts. It looks like we're using a workaround instead of the { generateObject } method from "ai", which natively supports Zod schemas and ensures typing. This could be more reliable than the current method of using generateText to generate, parse, and retry until we get the desired output.

Using { generateObject } would allow us to eliminate the custom generateObject and generateObjectArray functions, simplifying the code and leveraging the AI SDK's structured output capabilities. Here’s the code as it stands now:

```typescript
export async function generateObject({
    runtime,
    context,
    modelClass,
}: {
    runtime: IAgentRuntime;
    context: string;
    modelClass: string;
}): Promise<any> {
    if (!context) {
        elizaLogger.error("generateObject context is empty");
        return null;
    }
    let retryDelay = 1000;

    while (true) {
        try {
            const response = await generateText({
                runtime,
                context,
                modelClass,
            });
            const parsedResponse = parseJSONObjectFromText(response);
            if (parsedResponse) {
                return parsedResponse;
            }
        } catch (error) {
            elizaLogger.error("Error in generateObject:", error);
        }

        await new Promise((resolve) => setTimeout(resolve, retryDelay));
        retryDelay *= 2;
    }
}
```

My proposal is to replace it with the generateObject function provided in the AI SDK, as described below:

```typescript
/**
Generate JSON with any schema for a given prompt using a language model.

This function does not stream the output. If you want to stream the output, use `streamObject` instead.

@returns
A result object that contains the generated object, the finish reason, the token usage, and additional information.
*/
declare function generateObject(options: Omit<CallSettings, 'stopSequences'> & Prompt & {
    output: 'no-schema';
    model: LanguageModel;
    mode?: 'json';
    experimental_telemetry?: TelemetrySettings;
    experimental_providerMetadata?: ProviderMetadata;
    _internal?: {
        generateId?: () => string;
        currentDate?: () => Date;
    };
}): Promise<GenerateObjectResult<JSONValue>>;
```
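For illustration, a call site with the AI SDK and a Zod schema could look something like this (the schema and model choice are mine, just to sketch the shape):

```typescript
import { generateObject } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

// Illustrative schema matching the message shape from the logs above.
const messageSchema = z.object({
  user: z.string(),
  text: z.string(),
  action: z.enum(["MUTE_ROOM", "ASK_CLAUDE", "NONE", "IGNORE"]),
});

async function generateMessage(context: string) {
  const { object } = await generateObject({
    model: openai("gpt-4o"), // illustrative; would be selected per modelClass
    schema: messageSchema,
    prompt: context,
  });
  // `object` is already parsed and typed; no regex sanitizing or retry loop.
  return object;
}
```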

Switching to this method would improve reliability and reduce custom parsing logic. I'd be interested to hear your thoughts!
