Generate Synthetic Training Data for [feature] in [industry]

Generate synthetic examples for training an AI model to recognize user intent for the [feature] in the [industry] sector.

Variables

feature

industry

quantity

data_format

Prompt Explanation

Below is the assembled prompt with the variable values inserted. Placeholders that have not been filled in remain in [brackets].
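The substitution described above can be reproduced outside the page with a few lines of Python. This is a minimal sketch, not the site's actual implementation: the function names `fill_template` and `unfilled_placeholders` are hypothetical, and it assumes placeholders are single bracketed words such as `[feature]` or `[data_format]`.

```python
import re

# Assumption: a placeholder is a bracketed identifier, e.g. [feature] or [data_format].
PLACEHOLDER = re.compile(r"\[(\w+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Insert the supplied values; unfilled placeholders stay in [brackets]."""
    def substitute(match: re.Match) -> str:
        value = values.get(match.group(1), "")
        # An empty or missing value leaves the original placeholder untouched.
        return value if value else match.group(0)

    return PLACEHOLDER.sub(substitute, template)


def unfilled_placeholders(text: str) -> list[str]:
    """List the placeholder names that still appear in the text."""
    return sorted(set(PLACEHOLDER.findall(text)))


title = "Generate Synthetic Training Data for [feature] in [industry]"
partly_filled = fill_template(title, {"feature": "bill pay"})
print(partly_filled)
print(unfilled_placeholders(partly_filled))
```

Calling `unfilled_placeholders` on the final prompt before sending it is a cheap way to catch a variable that was left blank.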

### Title

Generate Synthetic Training Data for [feature] in [industry]

### Objective

The goal is to generate [quantity] high-quality synthetic data samples for training an intent-classification model. Success means a diverse, realistic dataset that adheres to all specified constraints and formatting rules.

### Role / Persona

Act as an AI Training Data Specialist with deep expertise in creating realistic, diverse, and compliant synthetic datasets for machine learning applications, specifically within the [industry] sector.

### Context (delimited)

"""
The task is to create synthetic data for a model that identifies user intent for a new [feature]. The target domain is the [industry] sector. The data must be realistic but entirely fictional and must not contain any personally identifiable information (PII).
"""

### Task Instructions

1. Generate exactly [quantity] unique and diverse examples of user requests related to the [feature].
2. For each example, provide the user request text and a corresponding intent label.
3. Ensure the data reflects common scenarios, potential user confusion, and relevant edge cases within the [industry] context.
4. Strictly follow the structure specified in the 'Output Format' section below.

### Constraints and Rules

- **Scope**: Focus only on user intents directly related to the specified [feature]. Do not include unrelated topics or intents.
- **Length**: Produce exactly [quantity] examples.
- **Tone / Style**: The generated user request text should be neutral and reflect language typically used by customers in the [industry] space.
- **Compliance**: Do not include any Personally Identifiable Information (PII), real people's names, or actual company names.
- **Delimiters**: Treat the 'Context' block as reference data only; do not include its contents in the output.

### Output Format

- **Medium**: Plain text.
- **Structure**: Provide the output as a numbered list. Each item must be a single entry conforming to the specified [data_format].
- **Example Structure**: `1. {"request": "Sample user text about the feature.", "intent": "Corresponding_Intent_Label"}`
- **Terminology / Units**: Use terminology and phrasing that are relevant and common to the [industry] and its engagement with the [feature].

### Evaluation Criteria (self-check before returning)

- The final output contains exactly [quantity] numbered items.
- Each item in the list strictly conforms to the [data_format] structure.
- All generated data is entirely synthetic and free of PII.
- The generated requests are diverse and relevant to the [feature] and [industry].
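
The count and format checks above can also be run mechanically on whatever the model returns. The following is a minimal sketch under stated assumptions: it takes [data_format] to be the JSON object shown in the Example Structure (keys `request` and `intent`), one numbered item per line, and the helper name `validate_output` is hypothetical. It does not detect PII, and it approximates diversity only by flagging exact duplicate requests.

```python
import json
import re

# Assumption: each item sits on one line as `N. {...}`, matching the Example Structure.
ITEM_PATTERN = re.compile(r"^\s*(\d+)\.\s+(\{.*\})\s*$")


def validate_output(text: str, quantity: int) -> list[str]:
    """Return the problems found; an empty list means every check passed."""
    problems = []
    items = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines between items
        match = ITEM_PATTERN.match(line)
        if match is None:
            problems.append(f"not a numbered JSON item: {line!r}")
            continue
        number = match.group(1)
        try:
            record = json.loads(match.group(2))
        except json.JSONDecodeError:
            problems.append(f"item {number}: invalid JSON")
            continue
        has_keys = isinstance(record, dict) and set(record) == {"request", "intent"}
        if not has_keys or not all(isinstance(v, str) for v in record.values()):
            problems.append(f"item {number}: expected string fields 'request' and 'intent'")
            continue
        items.append(record)

    if len(items) != quantity:
        problems.append(f"expected {quantity} valid items, found {len(items)}")
    requests = [item["request"] for item in items]
    if len(set(requests)) != len(requests):
        problems.append("duplicate request text found")
    return problems


sample = (
    '1. {"request": "How do I turn on autopay?", "intent": "Enable_Feature"}\n'
    '2. {"request": "Why was I charged twice?", "intent": "Report_Issue"}'
)
print(validate_output(sample, 2))
```

Returning a list of problems rather than raising on the first one makes it easier to feed all failures back to the model in a single follow-up request.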
