Generate Synthetic Training Data for [feature] in [industry]
Generate a set of synthetic data examples for training an AI model on the [feature] for the [industry] sector.
This is the assembled prompt. Any placeholder that has not been filled in remains in [brackets].
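The substitution behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the page's actual implementation: the helper names `fill_template` and `unfilled` are hypothetical, the value "bill pay" is made up, and the sketch assumes placeholders are lowercase names with underscores wrapped in square brackets.

```python
import re

# Assumption: placeholders look like [feature] or [data_format].
PLACEHOLDER = re.compile(r"\[([a-z_]+)\]")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace [name] placeholders; names without a value stay in brackets."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        return values.get(name, match.group(0))
    return PLACEHOLDER.sub(substitute, template)


def unfilled(text: str) -> list[str]:
    """Return placeholder names still present, in first-seen order."""
    seen: list[str] = []
    for name in PLACEHOLDER.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


template = (
    "Generate [quantity] examples for the [feature] "
    "in the [industry] sector as [data_format]."
)
# Hypothetical values; only two of the four variables are supplied.
filled = fill_template(template, {"quantity": "50", "feature": "bill pay"})
print(filled)
print(unfilled(filled))
```

Calling `unfilled` before sending the prompt is a cheap way to catch a variable that was never supplied.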
### Title
Generate Synthetic Training Data for [feature] in [industry]
### Objective
The goal is to generate [quantity] high-quality synthetic data samples for training a classification model. Success means a diverse, realistic dataset that adheres to all of the constraints and formatting rules specified below.
### Role / Persona
Act as an AI Training Data Specialist with deep expertise in creating realistic, diverse, and compliant synthetic datasets for machine learning applications, specifically within the [industry] sector.
### Context (delimited)
"""
The task is to create synthetic data for a model that identifies user intent for a new [feature]. The target domain is the [industry] industry. The data must be realistic but entirely fictional and must not contain any personally identifiable information (PII).
"""
### Task Instructions
1. Generate exactly [quantity] unique and diverse examples of user requests related to the [feature].
2. For each example, provide the user request text and a corresponding intent label.
3. Ensure the data reflects common scenarios, potential user confusion, and relevant edge cases within the [industry] context.
4. Strictly follow the structure specified in the 'Output Format' section below.
### Constraints and Rules
- **Scope**: Focus only on user intents directly related to the specified [feature]. Do not include unrelated topics or intents.
- **Length**: Produce exactly [quantity] examples.
- **Tone / Style**: The generated user request text should be neutral and reflect language typically used by customers in the [industry] space.
- **Compliance**: Do not include any Personally Identifiable Information (PII), real people's names, or actual company names.
- **Delimiters**: Treat the 'Context' block as reference data only; do not include its contents in the output.
### Output Format
- **Medium**: Plain text.
- **Structure**: Provide the output as a numbered list. Each item must be a single entry conforming to the specified [data_format].
- **Example Structure** (shown for a JSON [data_format]; adapt if a different format is specified): `1. {"request": "Sample user text about the feature.", "intent": "Corresponding_Intent_Label"}`
- **Terminology / Units**: Use terminology and phrasing that are relevant and common to the [industry] and its engagement with the [feature].
### Evaluation Criteria (self-check before returning)
- The final output contains exactly [quantity] numbered items.
- Each item in the list strictly conforms to the [data_format] structure.
- All generated data is 100% synthetic and free of PII.
- The generated requests are diverse and relevant to the [feature] and [industry].
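
Once a model has responded, the mechanical parts of the evaluation criteria can also be checked in code instead of relying on the model's self-check alone. The sketch below is a hedged illustration: the function name `check_output` is hypothetical, and it assumes [data_format] is a one-line JSON object with exactly the keys `request` and `intent`, as in the Example Structure. It checks item count, numbering, structure, and duplicate requests; it does not detect PII or judge relevance, which need separate review.

```python
import json
import re

# Assumption: each item is "N. {json object}" on a single line.
ITEM = re.compile(r"^(\d+)\.\s+(\{.*\})\s*$")


def check_output(text: str, quantity: int) -> list[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems: list[str] = []
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if len(lines) != quantity:
        problems.append(f"expected {quantity} items, found {len(lines)}")

    seen_requests: set[str] = set()
    for position, line in enumerate(lines, start=1):
        match = ITEM.match(line.strip())
        if not match:
            problems.append(f"line {position}: not a numbered JSON item")
            continue
        if int(match.group(1)) != position:
            problems.append(f"line {position}: numbered {match.group(1)}")
        try:
            record = json.loads(match.group(2))
        except json.JSONDecodeError:
            problems.append(f"line {position}: invalid JSON")
            continue
        if not isinstance(record, dict) or set(record) != {"request", "intent"}:
            problems.append(f"line {position}: keys must be request and intent")
            continue
        # Case-insensitive comparison is a crude stand-in for a diversity check.
        key = str(record["request"]).strip().lower()
        if key in seen_requests:
            problems.append(f"line {position}: duplicate request")
        seen_requests.add(key)
    return problems


# Hypothetical model output containing a repeated request.
sample = (
    '1. {"request": "How do I turn this on?", "intent": "Enable_Feature"}\n'
    '2. {"request": "How do I turn this on?", "intent": "Enable_Feature"}'
)
print(check_output(sample, 2))
```

Returning a list of problems instead of raising on the first one makes it easier to send the full set of issues back to the model in a follow-up request.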