Prompting, the Wonderful Wizard (Part 2)


This is my second post about Large Language Models (LLMs).

In the first part I covered:

  • How did it start?
    • In-context learning
  • How has it evolved, and why is it important?
  • How does it work?
    • Functional approach
    • Cast as a known process
  • When does it not work?
    • Planning
    • Compositional tasks

Now, let’s continue with:

What Are the Risks Involved?

The capabilities of GPT-4 will shift perceptions on tasks that require human effort, potentially leading to the displacement of jobs and broader economic influences. Other implications of the new powers include the enablement of malevolent actors with new tools of disinformation and manipulation. On limitations, deficits in the reliability of the system and in the biases that it learns, can lead to problems given potential over-reliance and poor understanding about when the system fails or will demonstrate bias, potentially amplifying existing societal issues.1

Clearly, there are many risks with LLMs and prompting. One classic problem is “hallucination”, which in simple terms means the model provides incorrect information without warning. Another issue is that prompting can be used to “jailbreak” LLMs, resulting in the LLM providing undesirable information.

Hallucinations


We discussed a key limitation of LLMs as their tendency to generate errors without warning, including mathematical, programming, attribution, and higher-level conceptual errors. Such errors are often referred to as hallucinations per their tendency to appear as reasonable or aligned with truthful inferences. Hallucinations, such as erroneous references, content, and statements, may be intertwined with correct information, and presented in a persuasive and confident manner, making their identification difficult without close inspection and effortful fact-checking.1

There are many examples of hallucination. At first they seem funny, but it quickly becomes apparent that this is a fundamental problem to solve.


Jailbreaking

We have seen that giving demonstrations helps in many tasks. Exactly the same behavior can be used to “jailbreak” LLMs.

Recent works show that even aligned LLMs are still vulnerable to adversarial attacks, typically called the jailbreak issue of LLMs.2

We investigate a family of simple long-context attacks on large language models: prompting with hundreds of demonstrations of undesirable behavior.3

Figure: Jailbreaking with many-shot demonstrations, source3

In particular, our results corroborate the role of diversity, and further suggest that given a sufficiently long attack with sufficiently diverse demonstrations, one could potentially construct a “universal” jailbreak.3

These results suggest that larger models might be even more susceptible to Many-Shot Jailbreaking (MSJ) attacks. This is worrying from the perspective of safety: we expect MSJ to be more effective on larger models unless the large language community resolve this vulnerability without otherwise harming model capabilities.3

It sounds a bit worrisome, especially because it’s just in-context learning at scale aimed at a vulnerability.
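To make the mechanism concrete, here is a minimal sketch of how a many-shot prompt is assembled: nothing more than concatenating demonstrations into one long context. The demonstrations below are harmless placeholders and the Demo/build_many_shot_prompt names are my own; the attack described in the paper fills these slots with hundreds of diverse demonstrations of the undesirable behavior.

  # A minimal sketch of many-shot prompting: in-context learning at scale,
  # i.e., concatenating many question/answer demonstrations into one prompt.
  from dataclasses import dataclass

  @dataclass
  class Demo:
      question: str
      answer: str

  def build_many_shot_prompt(demos: list[Demo], final_question: str) -> str:
      """Format demonstrations as a dialogue, followed by the real query."""
      parts = [f"User: {d.question}\nAssistant: {d.answer}" for d in demos]
      parts.append(f"User: {final_question}\nAssistant:")
      return "\n\n".join(parts)

  # The paper's observation: effectiveness grows with the number (and
  # diversity) of demonstrations, which long context windows now allow.
  demos = [Demo(f"Placeholder question {i}?", f"Placeholder answer {i}.")
           for i in range(256)]
  prompt = build_many_shot_prompt(demos, "Final question?")
  print(len(prompt))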

Privacy and security

There is a vast amount of work on understanding different aspects of this area. For example, LLMs are capable of leaking personal information.

Figure: Different vulnerabilities in LLMs, source4

Furthermore, there are cases where LLMs can not only leak information but also infer personal attributes.

Figure: Privacy violation in LLMs, source5

Our findings highlight that current LLMs can infer personal data at a previously unattainable scale. In the absence of working defenses, we advocate for a broader discussion around LLM privacy implications beyond memorization, striving for a wider privacy protection.5

How to Use It? (TL;DR)


If you’ve made it through the background sections, you might be interested in practical tips on prompting. Here are some valuable resources to get you started: Lilian Weng’s Prompt Engineering post6, the Prompt Engineering Guide7, The Prompt Report8, Eugene Yan’s Prompting Fundamentals9, and What We Learned from a Year of Building with LLMs10.

Video: From Natural Language Understanding on YouTube, https://youtu.be/0mXbM2j3Dzs?si=gDOUDbwFJLIoCRPB
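As a starting point, here is a minimal sketch of the kind of structured prompt these resources recommend: a clear instruction, some context, a few demonstrations, and the actual input, separated by delimiters. The section names and the build_prompt helper are illustrative choices of mine, not taken from any particular guide.

  # A minimal sketch of a structured prompt: instruction, context,
  # few-shot examples, and the input, separated by clear delimiters.
  def build_prompt(instruction, context, examples, user_input):
      example_block = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)
      return (
          f"### Instruction\n{instruction}\n\n"
          f"### Context\n{context}\n\n"
          f"### Examples\n{example_block}\n\n"
          f"### Input\n{user_input}\n\n### Output\n"
      )

  prompt = build_prompt(
      instruction="Classify the sentiment of the input as positive or negative.",
      context="The inputs are short movie reviews.",
      examples=[("I loved every minute.", "positive"),
                ("A complete waste of time.", "negative")],
      user_input="The plot dragged, but the acting was superb.",
  )
  print(prompt)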

Beyond Prompting


Retrieval Augmented Generation (RAG)

While prompting is powerful, it’s essential to consider other aspects, such as how and from where prompts can be generated or retrieved. There is significant interest in the intersection of NLP/LLMs and information retrieval.

Figure: Combining LLMs and information retrieval, from https://youtu.be/K_Dh0Sxujuc?si=4S1ZWj1SL4qcZsTB

The above figure illustrates different components of a prompt: paragraph, question, answer. These could be generated by LLMs or retrieved from another source. The retrieval could be based on similarity, and the question/query itself could be rewritten by an LLM. The interplay between IR and LLMs is a compelling use case.
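As a rough sketch of that retrieve-then-prompt loop, the snippet below picks the most similar paragraphs with plain cosine similarity and builds a prompt from them. The embed and llm_complete functions are hypothetical stand-ins for a real embedding model and a real LLM API; a production retriever would use a vector index instead of a linear scan.

  import math

  # A minimal sketch of retrieval-augmented generation:
  # embed the question, retrieve similar paragraphs, build a prompt, answer.
  def cosine(a, b):
      dot = sum(x * y for x, y in zip(a, b))
      norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
      return dot / norm

  def retrieve(question, corpus, embed, k=3):
      q_vec = embed(question)
      return sorted(corpus, key=lambda p: cosine(embed(p), q_vec), reverse=True)[:k]

  def answer(question, corpus, embed, llm_complete):
      paragraphs = retrieve(question, corpus, embed)
      prompt = (
          "Answer the question using only the paragraphs below.\n\n"
          + "\n\n".join(paragraphs)
          + f"\n\nQuestion: {question}\nAnswer:"
      )
      return llm_complete(prompt)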

Multi-agent LLMs

Another example is using different LLMs in a “multi-agent” approach. Given LLMs’ planning limitations, many of these works focus on multi-agent conversations or “agentic” scenarios.

Figure: A multi-agent LLM framework, from https://microsoft.github.io/autogen/docs/Use-Cases/agent_chat/

AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language and computer code can be used to program flexible conversation patterns for different applications. AutoGen serves as a generic framework for building diverse applications of various complexities and LLM capacities.11
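As a rough illustration of the conversational pattern described above, the sketch below sets up a two-agent chat with AutoGen’s AssistantAgent and UserProxyAgent classes. Exact class and parameter names can differ between AutoGen versions, so treat this as a sketch under those assumptions rather than reference code.

  # A sketch of a two-agent AutoGen conversation: an LLM-backed assistant
  # and a user proxy that relays the task and drives the exchange.
  from autogen import AssistantAgent, UserProxyAgent

  llm_config = {"config_list": [{"model": "gpt-4", "api_key": "YOUR_API_KEY"}]}

  assistant = AssistantAgent("assistant", llm_config=llm_config)
  user_proxy = UserProxyAgent(
      "user_proxy",
      human_input_mode="NEVER",      # fully automated, no human in the loop
      code_execution_config=False,   # do not execute generated code in this sketch
  )

  # The user proxy starts the conversation; the agents then exchange
  # messages until the task is considered done.
  user_proxy.initiate_chat(
      assistant,
      message="Summarize the main risks of prompting LLMs in three bullet points.",
  )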


My personal take


  "I thought Oz was a great Head,” said Dorothy.
  “And I thought Oz was a lovely Lady,” said the Scarecrow.
  “And I thought Oz was a terrible Beast,” said the Tin Woodman.
  “And I thought Oz was a Ball of Fire,” exclaimed the Lion.
  “No, you are all wrong,” said the little man meekly.
  “I have been making believe.”
  “Making believe!” cried Dorothy. “Are you not a Great Wizard?”
  “Hush, my dear,” he said.
  “Don’t speak so loud, or you will be overheard and I should be ruined.
   I’m supposed to be a Great Wizard.”
  “And aren’t you?” she asked.
  “Not a bit of it, my dear; I’m just a common man.”
                                    - The Wonderful Wizard of Oz

Like the Wizard of Oz, LLMs are widely believed to have extraordinary powers. Unlike with the Wizard of Oz, I don’t think that understanding their limitations and seeing their flaws makes them less exciting. For me, it just removes the mystery and magic, making them more manageable: more like a tool that excels at certain tasks. LLMs are not perfect; they appear to have reasoning capabilities, but there is evidence that pattern matching is at their core. LLMs make it easy to access specific knowledge, but this comes at the cost of correctness. Will every possible application work with LLMs? No. But finding the tasks where an LLM is useful requires knowledge and engineering to make it work. It’s not magic; it never was.


Thanks to Anna, Bilal, Jonathan, and Kry for reading earlier versions of this post.

Cite as:

Hernandez-Leal, Pablo. (June 2024). Prompting, the Wonderful Wizard (Part 2), pablohl.github.io, https://pablohl.github.io/posts/prompts2/

  @article{hernandezleal2024prompting2,
    title   = "Prompting, the Wonderful Wizard (Part 2)",
    author  = "Hernandez-Leal, Pablo",
    journal = "pablohl.github.io",
    year    = "2024",
    month   = "June",
    url     = "https://pablohl.github.io/posts/prompts2/"
  }


  1. Bubeck et al. Sparks of Artificial General Intelligence: Early experiments with GPT-4 ↩︎

  2. Wei, Zeming, Yifei Wang, and Yisen Wang. “Jailbreak and guard aligned language models with only few in-context demonstrations.” (2023). ↩︎

  3. Anil, Cem, et al. “Many-shot Jailbreaking.” Anthropic, April (2024). ↩︎

  4. Das, Badhan Chandra, M. Hadi Amini, and Yanzhao Wu. “Security and privacy challenges of large language models: A survey.” (2024). ↩︎

  5. Staab, Robin, et al. “Beyond memorization: Violating privacy via inference with large language models.” (2023). ↩︎

  6. Weng, Lilian. (Mar 2023). Prompt Engineering. Lil’Log. ↩︎

  7. Saravia, Elvis, Prompt Engineering Guide (2022) ↩︎

  8. Schulhoff, Sander, et al. “The Prompt Report: A Systematic Survey of Prompting Techniques.” (2024). ↩︎

  9. Yan, Ziyou. (May 2024). Prompting Fundamentals and How to Apply them Effectively. eugeneyan.com. ↩︎

  10. What We Learned from a Year of Building with LLMs ↩︎

  11. AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation ↩︎