
Unable to get a simple example to work with Llamasharp and Semantic Kernel #186

Open
bancroftway opened this issue Sep 29, 2023 · 7 comments
Labels: question (Further information is requested)

@bancroftway

bancroftway commented Sep 29, 2023

I am using packages LLamaSharp 0.5.1, LLamaSharp.semantic-kernel 0.5.1, and Microsoft.SemanticKernel.Core 0.24.230918.1-preview. In a very simple example inspired by this video, I am unable to get any results. Please advise: what am I doing wrong?

using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using LLama;
using LLama.Common;
using LLamaSharp.SemanticKernel.TextEmbedding;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.AI.Embeddings;
using Microsoft.SemanticKernel.Memory;

static async Task Main(string[] args)
{
    var modelPath = @"E:\Projects\Scratchpad\HuggingFace\llama-2-7b-guanaco-qlora.Q2_K.gguf";

    var seed = 1337;
    // Load the weights into memory with embedding mode enabled.
    var parameters = new ModelParams(modelPath)
    {
        Seed = seed,
        EmbeddingMode = true
    };

    using var model = LLamaWeights.LoadFromFile(parameters);
    var embedding = new LLamaEmbedder(model, parameters);

    // Build a kernel with an in-memory vector store and the local llama embedder.
    var kernel = Kernel.Builder
        .WithMemoryStorage(new VolatileMemoryStore())
        .WithAIService<ITextEmbeddingGeneration>("local-llama-embed", new LLamaSharpEmbeddingGeneration(embedding), true)
        .Build();

    var memories = new Dictionary<string, string>
    {
        { "rec1", "My name is Andy" },
        { "rec2", "I currently work as a tour guide" },
        { "rec3", "I have been living in Seattle since 2005" },
        { "rec4", "I have visited France and Italy five times since 2015" },
        { "rec5", "My family is from New York" },
    };

    // Embed and store each record in the "aboutme" collection.
    foreach (var memory in memories)
    {
        await kernel.Memory.SaveInformationAsync(collection: "aboutme", id: memory.Key, text: memory.Value);
    }

    var query = "what is my name?";
    var results = kernel.Memory.SearchAsync("aboutme", query, limit: 2);
    await foreach (var result in results)
    {
        Console.WriteLine(result.Metadata.Text);
        Console.WriteLine(result.Relevance);
    }

    query = "what do I do for work?";
    results = kernel.Memory.SearchAsync("aboutme", query, limit: 2);
    await foreach (var result in results)
    {
        Console.WriteLine(result.Metadata.Text);
        Console.WriteLine(result.Relevance);
    }
}
@negatron99

Hi, using your code as a starting point, I had to lower the minRelevanceScore in SearchAsync to 0.1; the relevance it returns for the "My name is Andy" result is 0.203498270155788.
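
Concretely, the change against the snippet above looks like this (a minimal sketch; in this SK version, ISemanticTextMemory.SearchAsync's minRelevanceScore parameter defaults to 0.7, so results scoring around 0.2 are filtered out by default):

    // Lower the relevance threshold so weak matches are not silently dropped;
    // the default minRelevanceScore of 0.7 discards scores around 0.2.
    var results = kernel.Memory.SearchAsync(
        collection: "aboutme",
        query: "what is my name?",
        limit: 2,
        minRelevanceScore: 0.1);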

@bancroftway
Author

@negatron99 thanks for spotting this.

In your opinion, is the performance acceptable? For the second question, even setting minRelevanceScore to 0.1 does not result in the correct memory being recalled. Is this a model issue, or an issue related to the SK integration?

@negatron99

@bancroftway I got that too; the queries often returned the expected answer in 2nd place.

I'm running the model locally, and I think that has a severe impact because of the size of the model I'm using.

(Note: I'm new to this.)

@AsakusaRinne
Collaborator

@xbotter Could you please take a quick look at this issue to see whether it's a problem with the semantic-kernel integration or with LLamaSharp itself?

@xbotter
Collaborator

xbotter commented Nov 5, 2023

I think it is closely related to the model. I tried the following five models, and the similarity results are shown below.

───────────────────────────────────────────── llama-2-13b.Q5_K_S.gguf ─────────────────────────────────────────────
                                                  what is my name?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec4                 │ rec3                 │ rec1                 │ rec5                 │ rec2                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.6039783735565276   │ 0.5775696417915415   │ 0.5670870575312752   │ 0.5664254645461733   │ 0.5238228503507946   │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
                                               what do I do for work?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec4                 │ rec3                 │ rec1                 │ rec5                 │ rec2                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.6366171180078882   │ 0.621276874387576    │ 0.5992382407191169   │ 0.5972585634473778   │ 0.5503394661753722   │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
────────────────────────────────────────────── llama-2-13b.Q4_0.gguf ──────────────────────────────────────────────
                                                  what is my name?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec4                 │ rec3                 │ rec5                 │ rec1                 │ rec2                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.59911805798635     │ 0.5710988156718331   │ 0.5680493978057511   │ 0.562425853962513    │ 0.5194026572582072   │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
                                               what do I do for work?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec4                 │ rec3                 │ rec5                 │ rec1                 │ rec2                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.6437136017484283   │ 0.6216543106841886   │ 0.6038441415633206   │ 0.5915474821002391   │ 0.551950617957862    │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
────────────────────────────────────────────── llama-2-7b.Q6_K.gguf ───────────────────────────────────────────────
                                                  what is my name?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec1                 │ rec2                 │ rec3                 │ rec5                 │ rec4                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.18273758637271353  │ -0.25136137740550846 │ -0.3654693736230258  │ -0.3870037852883792  │ -0.399781303434732   │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
                                               what do I do for work?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec5                 │ rec3                 │ rec4                 │ rec2                 │ rec1                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.38100557980647803  │ 0.26378284664020213  │ 0.23677382416063356  │ 0.1560544298268359   │ 0.10642410942596608  │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
───────────────────────────────────────────── llama-2-7b.Q5_K_S.gguf ──────────────────────────────────────────────
                                                  what is my name?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec1                 │ rec2                 │ rec3                 │ rec5                 │ rec4                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.21707250428374697  │ -0.23404267716418634 │ -0.3586767363888217  │ -0.3804505754365624  │ -0.3960521133365741  │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
                                               what do I do for work?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec5                 │ rec3                 │ rec4                 │ rec2                 │ rec1                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.41390582241379037  │ 0.30484629116732204  │ 0.2816259194460438   │ 0.17762617968673447  │ 0.09091840786347069  │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
────────────────────────────────────────────── llama-2-7b.Q4_0.gguf ───────────────────────────────────────────────
                                                  what is my name?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec1                 │ rec2                 │ rec3                 │ rec5                 │ rec4                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.2579590203445245   │ -0.24737413358757243 │ -0.363608211420225   │ -0.3706191111105214  │ -0.39100814399201617 │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
                                               what do I do for work?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec5                 │ rec3                 │ rec4                 │ rec2                 │ rec1                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.35964302792416625  │ 0.24689123570060933  │ 0.2187642599023262   │ 0.13875857455840124  │ 0.09581554281466385  │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘

As a comparison, here are the results generated using OpenAI.

───────────────────────────────────────────── text-embedding-ada-002 ──────────────────────────────────────────────
                                                  what is my name?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec1                 │ rec2                 │ rec5                 │ rec3                 │ rec4                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.8472871956899114   │ 0.7746655664617795   │ 0.7705371666228756   │ 0.7447792280263698   │ 0.7109783032151401   │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘
                                               what do I do for work?
┌──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┬──────────────────────┐
│ rec2                 │ rec1                 │ rec3                 │ rec5                 │ rec4                 │
├──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┼──────────────────────┤
│ 0.8237805470923869   │ 0.7608428315511899   │ 0.7440328693464854   │ 0.738623944342763    │ 0.7086218716424992   │
└──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┴──────────────────────┘

I am using LLamaSharp.semantic-kernel 0.7.0 and Microsoft.SemanticKernel.Core 1.0.1-beta. Compared to version 0.5.1, there is not much change in terms of logic.
However, @AsakusaRinne, I found that the embeddings generated using GPU and CPU are different, but the difference is not significant.
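
For anyone who wants to reproduce these rankings outside Semantic Kernel, here is a rough sketch that embeds the query and each record directly and ranks by cosine similarity, which is also what VolatileMemoryStore uses for relevance. It reuses embedding and memories from the snippet above; the GetEmbeddings call is the 0.5.x-era LLamaEmbedder API, so treat the exact signature as an assumption:

    // Rank records by cosine similarity between the query embedding
    // and each record embedding (requires using System.Linq).
    static float Cosine(float[] a, float[] b)
    {
        float dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
    }

    var queryVec = embedding.GetEmbeddings("what is my name?");
    var ranked = memories
        .Select(m => (m.Key, Score: Cosine(queryVec, embedding.GetEmbeddings(m.Value))))
        .OrderByDescending(x => x.Score);

    foreach (var (key, score) in ranked)
        Console.WriteLine($"{key}: {score}");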

@AsakusaRinne
Collaborator

> However, I found that the embeddings generated using GPU and CPU are different, but the difference is not significant.

There are different optimization strategies on CPU and CUDA, so a slight difference is expected.

@bancroftway @negatron99 Would you like to try it with v0.7.0? I remember there were some fixes in the semantic-kernel integration package after v0.5.1.
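
If you want to quantify the CPU/GPU difference yourself, a minimal sketch (ModelParams.GpuLayerCount and the LLamaEmbedder calls are the 0.5.x-era names; treat them as assumptions for other versions):

    // Embed the same text with GPU offloading disabled and enabled,
    // then report the largest elementwise deviation between the two vectors.
    var cpuParams = new ModelParams(modelPath) { EmbeddingMode = true, GpuLayerCount = 0 };
    var gpuParams = new ModelParams(modelPath) { EmbeddingMode = true, GpuLayerCount = 32 };

    using var cpuModel = LLamaWeights.LoadFromFile(cpuParams);
    using var gpuModel = LLamaWeights.LoadFromFile(gpuParams);
    using var cpuEmbedder = new LLamaEmbedder(cpuModel, cpuParams);
    using var gpuEmbedder = new LLamaEmbedder(gpuModel, gpuParams);

    var cpuVec = cpuEmbedder.GetEmbeddings("My name is Andy");
    var gpuVec = gpuEmbedder.GetEmbeddings("My name is Andy");

    float maxDiff = 0;
    for (int i = 0; i < cpuVec.Length; i++)
        maxDiff = MathF.Max(maxDiff, MathF.Abs(cpuVec[i] - gpuVec[i]));
    Console.WriteLine($"max elementwise diff: {maxDiff}");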

martindevans added the question (Further information is requested) label on Nov 6, 2023
@xbotter
Collaborator

xbotter commented Nov 7, 2023

This is another comparison result from the bge-large-en-v1.5 model.

                        what is my name?
┌───────────┬───────────┬────────────┬────────────┬────────────┐
│ rec1      │ rec5      │ rec2       │ rec3       │ rec4       │
├───────────┼───────────┼────────────┼────────────┼────────────┤
│ 0.6125676 │ 0.4444277 │ 0.44250283 │ 0.40801963 │ 0.31189275 │
└───────────┴───────────┴────────────┴────────────┴────────────┘
                     what do I do for work?
┌────────────┬───────────┬────────────┬────────────┬────────────┐
│ rec2       │ rec3      │ rec5       │ rec1       │ rec4       │
├────────────┼───────────┼────────────┼────────────┼────────────┤
│ 0.61269283 │ 0.4735341 │ 0.45722228 │ 0.42069164 │ 0.39852148 │
└────────────┴───────────┴────────────┴────────────┴────────────┘

The results look relatively better. But the model is BERT-based and cannot currently be converted into a format usable by llama.cpp, so I used bert.cpp for the processing.
According to ggerganov/llama.cpp#2872, llama.cpp has started working on BERT support. Looking forward to its completion.
