Simplify Information Extraction: A Reusable Prompt Template for GPT Models

Author: Murphy  |  Time: 2025-03-23 11:50:29


Image Generated by DALL·E

Introduction

If I told you I had created the ultimate prompt template for information extraction tasks, one that guarantees exceptional recall and precision and perfectly formatted output every time, you'd probably scoff.

And rightfully so – because nobody can make those guarantees given the unpredictable nature of LLMs.

However, this is what I can say: after extensive work on over a dozen nuanced medical information extraction tasks – each requiring deep domain expertise – I've developed a prompt template that utilizes the prompting techniques that have worked for me to significantly boost performance and minimize erroneous outputs. This template has helped me streamline my workflow, reduce iteration cycles, and bring a reliable level of consistency to my results.

In this article, I'll walk through this template, explain the rationale behind each section, and share the lessons I've learned along the way. My hope is that the trial and error behind these insights becomes time well spent, distilled into a resource for others facing similar challenges in extracting precise information from complex text data.

With that being said, I'd like to emphasize again that this template is tailored specifically for the task of extracting key pieces of information from text data, particularly in specialized fields like the medical field where high accuracy is critical.

Prompting Techniques

Before I share the template, I'll share some of the key prompting techniques that helped boost performance:

Few-shot prompting

This involves providing the model with a few examples of the desired input-output pairs. By giving the model specific pieces to look for, along with several examples of the desired output format, it better understands the task at hand.
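As a minimal sketch of this idea (the dosage-extraction task and all names here are illustrative, not from the article's template), a few-shot section might be assembled like this:

```python
# Few-shot prompting: give the model input-output pairs that show both what
# to look for and the exact output format. (The dosage-extraction task here
# is illustrative, not the article's medical template.)
FEW_SHOT_EXAMPLES = """EXAMPLES:
1. Input: "The doctor started me on 50 mg of sertraline daily."
   Output: {"medication": "sertraline", "dose": "50 mg daily"}
2. Input: "I stopped taking my vitamins last week."
   Output: {"medication": null, "dose": null}
"""

def build_prompt(task_description: str, text: str) -> str:
    # The examples sit between the task description and the new input.
    return f'{task_description}\n\n{FEW_SHOT_EXAMPLES}\nInput: "{text}"\nOutput:'
```

Including a null-output example, as above, also teaches the model what to do when nothing qualifies for extraction.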

Negative prompting

Explicitly stating what should not be included in the output has been key to improving precision. This technique reduces the likelihood of extracting irrelevant or erroneous information.
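A negative-prompting block can be as simple as a "do not extract" list seeded with counter-examples from past mistakes (the wording below is a sketch, not the article's exact template):

```python
# Negative prompting: explicitly rule out categories the model has confused
# before, with concrete counter-examples. (Wording is illustrative.)
EXCLUSION_BLOCK = """EXCLUSION CRITERIA:
- Do NOT extract over-the-counter supplements (e.g., melatonin, vitamin D).
- Do NOT extract medications the speaker considered but never took
  (e.g., "My doctor suggested Prozac, but I decided against it").
"""
```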

Chain of Thought Reasoning (CoT)


Encouraging the model to explain its reasoning process step-by-step enhances performance in two ways:

  • First, the model's explanations can reveal nuances that may require adjustments in the prompt, particularly in the negative prompting section.
  • Second, CoT allows the model to cross-check each extracted quote against both inclusion and exclusion criteria, leading to more precise and accurate responses by filtering down the initial set of results.
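The cross-checking described in the second point can be requested directly in the prompt. One possible phrasing (a sketch, not the article's exact wording):

```python
# A Chain-of-Thought instruction that makes the model enumerate candidates,
# test each against the inclusion and exclusion criteria, and only then
# produce the final answer. (Wording is illustrative.)
COT_INSTRUCTION = """Before giving your final answer, reason step by step:
1. List every candidate quote you found.
2. For each quote, state which extraction criterion it satisfies.
3. Discard any quote that matches an exclusion criterion, and explain why.
4. Only then output the final JSON object.
"""
```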

Role prompting


Assigning a specific role or persona to the model helps guide its responses. There is some debate on whether or not role prompting improves performance; however, for my particular tasks, I've found this technique useful. Many of my projects require a domain expert overseeing a complex process, with each extracted piece fitting into a checklist. When the model adopts the persona of an expert, it gains a broader context for why specific information is being extracted, which can significantly improve the quality of the results.
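In the chat-message format used by the OpenAI API, the role typically lives in the system message, so it frames every later instruction (the persona text below is illustrative):

```python
# Role prompting via the system message: the persona is set once and colors
# all subsequent instructions. (Persona wording is illustrative.)
messages = [
    {"role": "system",
     "content": ("You are a clinical pharmacist reviewing therapy-session "
                 "transcripts for a medication-safety checklist.")},
    {"role": "user",
     "content": "Extract every quote mentioning antidepressant side effects."},
]
```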

Now, onto the prompt template!

Prompt Template Structure

This reusable prompt template consists of five key sections:

  • Summary of task: A clear, concise description of the information extraction task, which often includes role prompting to set the context for the model's responses.
  • Extraction criteria: Specific guidelines outlining what information should be extracted. This section defines the scope of the task, ensuring the model focuses on the relevant data.
  • Examples: Demonstrations of correct input-output pairs. For tasks focused on extracting quotes, I typically include a list of example quotes along with examples of the expected output format. I try to include diverse examples that cover various scenarios the model might encounter.
  • Exclusion criteria / negative examples: Clear guidelines on what should not be included in the extraction in order to prevent the model from extracting irrelevant or incorrect information. I will typically address common mistakes or misinterpretations I've observed in previous attempts.
  • Output instructions: Ensure the extracted information is presented in a consistent, usable format. Here is where I'll specify the exact format (e.g., JSON, list) and request step-by-step / CoT reasoning.

Example: Medication Extraction from Mental Health Conversations

Here's an example of a filled-out template for extracting information specifically about antidepressant use and any associated side effects from a conversation between a coach and their mentee.

PROMPT_MENTAL_HEALTH_COACHING = """
As a mental health coach specializing in supporting clients with
depression and anxiety, your task is to assess whether the mentee
is currently taking antidepressants and to identify any side effects
they have mentioned during the session. Review the conversation
between the coach and mentee to extract relevant information about
antidepressant use and associated side effects.

CRITERIA FOR EXTRACTION:
- Antidepressant Use: Identify quotes where the mentee explicitly 
mentions taking antidepressants. The following is a non-exhaustive
list of common antidepressants you should look for:
  - Selective Serotonin Reuptake Inhibitors (SSRIs): 
    - e.g., Fluoxetine (Prozac), Sertraline (Zoloft), Citalopram (Celexa)
  - Serotonin-Norepinephrine Reuptake Inhibitors (SNRIs): 
    - e.g., Venlafaxine (Effexor), Duloxetine (Cymbalta)
  - Tricyclic Antidepressants (TCAs):
    - e.g., Amitriptyline, Nortriptyline
  - Atypical Antidepressants: 
    - e.g., Bupropion (Wellbutrin), Mirtazapine (Remeron)
  - Monoamine Oxidase Inhibitors (MAOIs):
    - e.g., Phenelzine (Nardil), Tranylcypromine (Parnate)

- Side Effects: Identify quotes where the mentee mentions any side effects
 they believe are related to their antidepressant medication.

EXAMPLES:
1. Input:
   - "I've been on my antidepressants for a few months now, but 
      lately, I've noticed I'm feeling more tired than usual. I'm also 
      still having trouble sleeping. I'm worried these might be side 
      effects of the medication."
   - Output:
     {{
       "antidepressant_use": "I've been on my antidepressants 
        for a few months now.",
       "side_effects": ["feeling more tired", "trouble sleeping"]
     }}

2. Input:
   - "Since I started taking Zoloft, I've noticed I'm feeling less anxious,
     but I'm also getting these headaches that I didn't have before."
   - Output:
     {{
       "antidepressant_use": "Since I started taking Zoloft.",
       "side_effects": ["headaches"]
     }}

3. Input:
   - "I'm not sure if it's the antidepressants, but I've been feeling dizzy
     lately, especially in the mornings."
   - Output:
     {{
       "antidepressant_use": "I'm not sure if it's the antidepressants.",
       "side_effects": ["feeling dizzy"]
     }}

EXCLUSION CRITERIA / NEGATIVE EXAMPLES:
- Do Not Include: General comments about well-being that are not 
  explicitly tied to antidepressant use or side effects (e.g., 
  "I'm feeling okay today" or "Work has been stressful lately").
- Purpose: To focus exclusively on mentions of antidepressant 
  use and specific side effects that the mentee attributes to 
  the medication.

OUTPUT INSTRUCTIONS:
- Format: Provide the extracted information in JSON format.
- Structure: Include the following fields: antidepressant_use and side_effects.
- Detailed Reasoning: Think step by step, checking each candidate 
  quote against the extraction and exclusion criteria, so that only 
  quotes directly related to antidepressant use and side effects 
  are included in the output.
- Example Format:
  {{
    "antidepressant_use": "[Extracted Quote about Antidepressant Use]",
    "side_effects": [
      "[Extracted Quote about Side Effect 1]",
      "[Extracted Quote about Side Effect 2]",
      ...
    ]
  }}

Here is the transcript: {transcript}
"""

Iterative Improvement Process

Creating an effective prompt template is an iterative process that requires continuous refinement. After developing an initial version of the template, I follow these steps to ensure it meets the desired performance:

  1. Start with a Basic Prompt: Begin with a simple prompt, avoiding the inclusion of examples or exclusion criteria.
  2. Analyze Output: Review the model's output to identify any errors, inconsistencies, or areas where the extraction falls short.
  3. Incorporate Examples: Add relevant examples to guide the model in producing correct extractions.
  4. Introduce Exclusion Criteria: Define clear exclusion criteria to prevent the model from extracting irrelevant or incorrect information.
  5. Apply Chain of Thought Reasoning: Implement Chain of Thought reasoning to improve accuracy and transparency in the model's decision-making process.
  6. Iterate and Fine-Tune: Continuously iterate on steps 2–5, adjusting each section based on the model's performance and any emerging patterns in the errors.
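To keep this loop honest, step 2 benefits from a small scoring harness that compares each prompt revision against hand-labeled gold quotes, so iteration is guided by numbers rather than impressions (the gold data below is illustrative):

```python
# A minimal evaluation harness: precision and recall of extracted quotes
# against a hand-labeled gold set. (Example data is illustrative.)
def precision_recall(predicted: set, gold: set) -> tuple:
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall

pred = {"feeling more tired", "trouble sleeping", "work has been stressful"}
gold = {"feeling more tired", "trouble sleeping"}
p, r = precision_recall(pred, gold)  # one false positive: p = 2/3, r = 1.0
```

A drop in precision after a revision is a signal to tighten the exclusion criteria; a drop in recall suggests the extraction criteria or examples need broadening.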

For example, if I notice in the above scenario that the model begins to extract information about other medications – such as mistakenly pulling details about anxiolytics like Xanax or mood stabilizers like Lithium – I would then revise the exclusion criteria to explicitly list these commonly confused medications. This helps ensure that the model focuses solely on antidepressants and their associated side effects, refining the extraction process to be more accurate and relevant.


Limitations and Best Practices

  • Prompt Length: Extremely long prompts may exceed the model's context window. I've found success in identifying and trimming unimportant sections, reducing the prompt length (and therefore, cost) while ensuring the model remains focused on the most relevant information.
  • Model-Specific Optimizations: Different models may respond more effectively to specific prompting styles. This prompt has been particularly effective with GPT-3.5 Turbo, GPT-4, and GPT-4 Turbo.
  • Task Complexity: For highly complex tasks, consider breaking them down into manageable subtasks or employing a multi-step approach to enhance clarity and performance.
  • Test for Robustness: Evaluate your template across a diverse range of inputs to confirm its reliability and adaptability.

Conclusion

If you've made it this far, you're probably wondering if this prompt template truly lives up to the promise of streamlining your information extraction tasks. The truth is, I have no idea how it'll work on your tasks. What I can say is that this template was pieced together across more than a dozen medical information extraction tasks, battle-tested in scenarios where high precision is crucial, and has been a staple in my workflow, consistently delivering reliable results for me.

For others who have complex information extraction tasks, I hope this template and workflow act as a valuable starting place. At the end of the day, this template isn't a one-size-fits-all solution, but a foundation of strategies that you can build upon to meet the unique demands of your tasks. Let me know if you decide to use it and how it works out for you. Additionally, I'd love to hear your thoughts on prompting techniques that have been useful for you as well! Feel free to leave a comment here or send me an email at [email protected].

Tags: ChatGPT, Data Science, LLM, Machine Learning, Prompt Engineering