Generate Synthetic Training Data for [feature] in [industry]

Generate a set of synthetic data examples for training an AI model on the [feature] for the [industry] sector.

The prompt template is shown below. Placeholders that have not been filled in ([feature], [industry], [quantity], [data_format]) remain in [brackets].
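
Filling the placeholders can also be scripted. The following is a minimal sketch, not part of the template itself: the function name `fill_template` and the sample values ("card freeze", "banking", "50") are hypothetical, and it assumes placeholders are lowercase names in square brackets. Placeholders without a supplied value are left in [brackets].

```python
import re

def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace [name] placeholders with supplied values.

    Placeholders with no matching key are left untouched, in brackets.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        # Fall back to the original "[name]" text when no value is given.
        return values.get(name, match.group(0))

    return re.sub(r"\[([a-z_]+)\]", substitute, template)

# Hypothetical one-line excerpt and values, for illustration only.
template = "Generate [quantity] examples for the [feature] in the [industry] sector as [data_format]."
filled = fill_template(
    template,
    {"quantity": "50", "feature": "card freeze", "industry": "banking"},
)
print(filled)
# → Generate 50 examples for the card freeze in the banking sector as [data_format].
```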

### Title

Generate Synthetic Training Data for [feature] in [industry]

### Objective

The goal is to generate [quantity] high-quality synthetic data samples for training an intent classification model. Success means a diverse, realistic dataset that adheres to all specified constraints and formatting rules.

### Role / Persona

Act as an AI Training Data Specialist with deep expertise in creating realistic, diverse, and compliant synthetic datasets for machine learning applications, specifically within the [industry] sector.

### Context (delimited)

"""
The task is to create synthetic data for a model that identifies user intent for a new [feature]. The target domain is the [industry] industry. The data must be realistic but entirely fictional and must not contain any personally identifiable information (PII).
"""

### Task Instructions

1. Generate exactly [quantity] unique and diverse examples of user requests related to the [feature].
2. For each example, provide the user request text and a corresponding intent label.
3. Ensure the data reflects common scenarios, potential user confusion, and relevant edge cases within the [industry] context.
4. Strictly follow the structure specified in the 'Output Format' section below.

### Constraints and Rules

- **Scope**: Focus only on user intents directly related to the specified [feature]. Do not include unrelated topics or intents.
- **Length**: Produce exactly [quantity] examples.
- **Tone / Style**: The generated user request text should be neutral and reflect language typically used by customers in the [industry] space.
- **Compliance**: Do not include any Personally Identifiable Information (PII), real people's names, or actual company names.
- **Delimiters**: Treat the 'Context' block as reference data only; do not include its contents in the output.

### Output Format

- **Medium**: Plain text.
- **Structure**: Provide the output as a numbered list. Each item must be a single entry conforming to the specified [data_format].
- **Example Structure** (shown for a JSON [data_format]; adapt to the format specified): `1. {"request": "Sample user text about the feature.", "intent": "Corresponding_Intent_Label"}`
- **Terminology / Units**: Use terminology and phrasing that are relevant and common to the [industry] and its engagement with the [feature].

### Evaluation Criteria (self-check before returning)

- The final output contains exactly [quantity] numbered items.
- Each item in the list strictly conforms to the [data_format] structure.
- All generated data is 100% synthetic and free of PII.
- The generated requests are diverse and relevant to the [feature] and [industry].
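
Some of the self-check criteria can also be verified mechanically once the model returns its output. The sketch below assumes [data_format] is JSON, matching the example structure in the template, with one numbered item per line. The function name `validate_output` and the sample requests are hypothetical. It checks the item count, numbering, JSON shape, and duplicate requests only; it does not detect PII or judge diversity, which still need separate review.

```python
import json
import re

# A numbered item such as: 1. {"request": "...", "intent": "..."}
ITEM_PATTERN = re.compile(r"^(\d+)\.\s+(\{.*\})\s*$")

def validate_output(text: str, quantity: int) -> list[str]:
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) != quantity:
        problems.append(f"expected {quantity} items, found {len(lines)}")

    seen_requests = set()
    for position, line in enumerate(lines, start=1):
        match = ITEM_PATTERN.match(line)
        if match is None:
            problems.append(f"line {position}: not a numbered JSON item")
            continue
        if int(match.group(1)) != position:
            problems.append(f"line {position}: numbered {match.group(1)}")
        try:
            item = json.loads(match.group(2))
        except json.JSONDecodeError:
            problems.append(f"line {position}: invalid JSON")
            continue
        if not isinstance(item, dict) or set(item) != {"request", "intent"}:
            problems.append(f"line {position}: keys must be 'request' and 'intent'")
            continue
        # Compare requests case-insensitively to catch trivial duplicates.
        request = str(item["request"]).strip().lower()
        if request in seen_requests:
            problems.append(f"line {position}: duplicate request")
        seen_requests.add(request)
    return problems

# Hypothetical model output, for illustration only.
sample = "\n".join([
    '1. {"request": "How do I freeze my card?", "intent": "Freeze_Card"}',
    '2. {"request": "Can I undo a card freeze?", "intent": "Unfreeze_Card"}',
])
print(validate_output(sample, quantity=2))  # → []
```

An empty list indicates the structural checks passed; any returned strings point to the line that needs regenerating or fixing.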