"By My Eyes" is a novel approach for integrating sensor data into Multimodal Large Language Models (MLLMs) by transforming long sequences of sensor data into visual inputs, such as graphs and plots. This method uses visual prompting to guide MLLMs in performing sensory tasks (e.g., human activity recognition, health monitoring) more efficiently and accurately than text-based methods.
Text-based methods that embed raw sensor readings directly in LLM prompts face challenges such as:

- High token costs, since long numeric sequences consume large parts of the context window
- Difficulty for LLMs in interpreting long streams of raw numbers
- Degraded accuracy as the amount of sensor data in the prompt grows
"By My Eyes" introduces visual prompts to represent sensor data as images (e.g., waveforms, spectrograms), making it easier for MLLMs to interpret. The key innovation is a visualization generator that automatically converts sensor data into optimal visual representations. This reduces token costs and enhances performance across various sensory tasks.
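The idea of turning a sensor window into an image for a visual prompt can be sketched as follows. This is an illustrative Python example using matplotlib, not the paper's actual visualization generator; the function name, figure settings, and sample rate are assumptions:

```python
import base64
import io

import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless rendering, no display needed
import matplotlib.pyplot as plt


def sensor_window_to_image(window: np.ndarray, sample_rate: float = 50.0) -> bytes:
    """Render a (samples, channels) sensor window as a PNG waveform plot."""
    t = np.arange(window.shape[0]) / sample_rate
    fig, ax = plt.subplots(figsize=(4, 2), dpi=100)
    for ch in range(window.shape[1]):
        ax.plot(t, window[:, ch], linewidth=0.8, label=f"axis {ch}")
    ax.set_xlabel("time (s)")
    ax.set_ylabel("acceleration")
    ax.legend(loc="upper right", fontsize=6)
    buf = io.BytesIO()
    fig.savefig(buf, format="png", bbox_inches="tight")
    plt.close(fig)
    return buf.getvalue()


# Example: a 2-second, 3-axis accelerometer window sampled at 50 Hz.
rng = np.random.default_rng(0)
png_bytes = sensor_window_to_image(rng.standard_normal((100, 3)))

# Most vision APIs accept base64-encoded images in the prompt payload;
# the single image replaces ~300 raw numbers serialized as text.
image_b64 = base64.b64encode(png_bytes).decode("ascii")
```

In a real pipeline, the resulting image would be attached to the MLLM prompt alongside the task instruction (e.g., "classify the activity shown in this accelerometer waveform").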
Steps of the method:

1. Segment the raw sensor stream into task-relevant windows.
2. Use the visualization generator to render each window as a candidate image (e.g., a waveform or spectrogram) and select the representation best suited to the task.
3. Feed the selected image, together with the task instructions, to the MLLM as a visual prompt.
"By My Eyes" was tested on nine sensory tasks across four modalities (accelerometer, ECG, EMG, and respiration sensors). The approach consistently outperformed text-based prompts, showing:
| Dataset | Modality | Task | Text Prompt Accuracy | Visual Prompt Accuracy | Token Reduction |
|---|---|---|---|---|---|
| HHAR | Accelerometer | Human activity recognition | 66% | 67% | 26.2× |
| PTB-XL | ECG | Arrhythmia detection | 73% | 80% | 3.4× |
| WESAD | Respiration | Stress detection | 48% | 61% | 49.8× |
The visual prompts also led to more efficient use of tokens, allowing MLLMs to handle larger datasets and more complex tasks without sacrificing accuracy.
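A back-of-envelope calculation shows where the savings come from; all numbers below are illustrative assumptions, not figures from the paper. Serializing each raw reading as text costs several tokens, while an entire window collapses into a single image with a roughly fixed token cost:

```python
# Illustrative token-cost comparison (assumed numbers, not from the paper).
samples_per_window = 50 * 60 * 3  # 60 s of 3-axis data at 50 Hz = 9,000 readings
tokens_per_number = 4             # assumed: digits, sign, decimal point, separator
text_tokens = samples_per_window * tokens_per_number

image_tokens = 765                # assumed fixed cost of one prompt image

reduction = text_tokens / image_tokens
print(f"text: {text_tokens} tokens, image: {image_tokens} tokens, "
      f"reduction: {reduction:.1f}x")
```

Under these assumptions the text prompt costs 36,000 tokens against a fixed image cost, a reduction of roughly 47×, which is in the same ballpark as the per-dataset reductions reported above.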
The "By My Eyes" method provides a cost-effective and performance-boosting solution for handling sensor data in MLLMs. By transforming raw sensor data into visual prompts, it addresses the limitations of text-based approaches, making it easier for MLLMs to solve real-world sensory tasks in fields like healthcare, environmental monitoring, and human activity recognition.