LLM Settings and Parameters

Core Parameters

Temperature

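  • Range: typically 0.0 to 2.0 (the upper limit varies by provider)
  • Purpose: Controls randomness by scaling the token probability distribution before sampling
  • Use Cases:
    • Low (0.0-0.3): Factual, consistent responses
    • Medium (0.4-0.7): Balance between consistency and variety
    • High (0.8-2.0): More creative, varied outputs; coherence can degrade near the top of the range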
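
To make the effect of temperature concrete, here is a minimal sketch of temperature-scaled softmax. The function name and the example scores are illustrative assumptions, and treating a temperature of 0 as greedy selection is a common convention rather than a guarantee about any particular provider.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: treat 0 as greedy decoding, all mass on the top token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens
for t in (0.2, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At low temperature the distribution concentrates on the highest-scoring token; at high temperature it flattens, so less likely tokens are sampled more often.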

Top-p (Nucleus Sampling)

  • Range: 0.0 to 1.0
  • Purpose: Restricts sampling to the smallest set of tokens whose cumulative probability reaches p
  • Use Cases:
    • Low (0.1-0.3): Focused, near-deterministic outputs
    • Medium (0.4-0.7): Natural language generation
    • High (0.8-1.0): More diverse responses
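
The following sketch shows one way nucleus filtering can work on an explicit probability list. The function name and the example distribution are assumptions made for illustration; real inference stacks apply the same idea to logits over the full vocabulary.

```python
def top_p_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability reaches top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the surviving probabilities sum to 1.
    return {index: p / total for index, p in kept}

probs = [0.5, 0.3, 0.15, 0.05]  # hypothetical next-token distribution
print(top_p_filter(probs, 0.2))  # only the most likely token survives
print(top_p_filter(probs, 0.9))  # the three most likely tokens survive
```

Sampling then draws only from the returned tokens, which is why a low top-p behaves almost like greedy decoding.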

Max Tokens

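  • Purpose: Caps the number of tokens the model may generate in a response
  • Considerations:
    • Model context window (input and output tokens typically share it)
    • Input token count
    • Cost optimization
    • Response completeness (a limit that is too low cuts the response off mid-sentence)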
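
As a rough illustration of how these considerations interact, the sketch below computes a safe limit. It assumes input and output tokens share one context window, which holds for many models but should be checked for yours; the function name and the numbers are hypothetical.

```python
def completion_budget(context_window, input_tokens, requested_max_tokens):
    """Return the largest max_tokens value that still fits in the context window."""
    available = context_window - input_tokens
    if available <= 0:
        raise ValueError("input alone fills the context window; truncate or chunk it")
    return min(requested_max_tokens, available)

# Hypothetical numbers: an 8192-token window with a 7000-token prompt.
print(completion_budget(context_window=8192, input_tokens=7000, requested_max_tokens=2000))
```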

Advanced Settings

Frequency Penalty

  • Range: -2.0 to 2.0
  • Purpose: Penalizes tokens in proportion to how often they have already appeared, reducing word repetition
  • Effects:
    • Positive values: Discourage repetition
    • Negative values: Encourage repetition
    • Zero: No penalty applied

Presence Penalty

  • Range: -2.0 to 2.0
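  • Purpose: Applies a one-time penalty to any token that has already appeared, regardless of count, which nudges the model toward new topics
  • Effects:
    • Positive values: Encourage new topics
    • Negative values: Favor tokens already used, keeping the response on topic
    • Zero: No penalty applied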
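
The difference between the two penalties is easiest to see side by side. This sketch follows the formulation described in OpenAI's API documentation, where the frequency penalty scales with the token's count and the presence penalty is applied once; other providers may implement these settings differently. The function name and the token scores are assumptions.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens, frequency_penalty=0.0, presence_penalty=0.0):
    """Adjust the scores of tokens that already appear in the output so far."""
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            adjusted[token] -= count * frequency_penalty  # grows with each repeat
            adjusted[token] -= presence_penalty           # flat, applied once
    return adjusted

logits = {"cat": 2.0, "dog": 1.5, "fish": 1.0}  # hypothetical token scores
history = ["cat", "cat", "dog"]
print(apply_penalties(logits, history, frequency_penalty=0.5, presence_penalty=0.3))
```

Here "cat" loses more than "dog" to the frequency penalty because it appeared twice, while both lose the same amount to the presence penalty; "fish" is untouched.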

Stop Sequences

  • Purpose: Strings that end generation as soon as the model produces them
  • Examples:
    • Custom delimiters
    • End markers
    • Special tokens
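
APIs usually apply stop sequences on the server, but the same behavior can be sketched client-side as a post-processing step. The function name, the sample text, and the delimiters below are illustrative assumptions.

```python
def truncate_at_stop(text, stop_sequences):
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stop_sequences:
        # Skip empty strings, which would otherwise match at position 0.
        position = text.find(stop) if stop else -1
        if position != -1:
            cut = min(cut, position)
    return text[:cut]

raw = "Answer: 42\n###\nNext question: ..."
print(repr(truncate_at_stop(raw, ["###", "<END>"])))
```

The stop sequence itself is excluded from the result, matching how most APIs return text that ends just before the delimiter.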

Context Window Settings

Input Context

  • Token counting
  • Context truncation
  • Document chunking
  • Memory management
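
Document chunking can be sketched as a sliding window over tokens. The example below splits on whitespace purely as a stand-in for a real tokenizer, so its counts will not match any model's token counts; the function name and the overlap parameter are assumptions.

```python
def chunk_tokens(tokens, chunk_size, overlap=0):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks

# Whitespace splitting is only a stand-in for a real tokenizer.
tokens = "the quick brown fox jumps over the lazy dog".split()
for chunk in chunk_tokens(tokens, chunk_size=4, overlap=1):
    print(chunk)
```

A small overlap keeps some shared context between neighboring chunks, at the cost of processing a few tokens twice.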

Output Context

  • Response formatting
  • Stream handling
  • Token budgeting
  • Completion signals

Best Practices

Parameter Selection

  • Match task requirements
  • Test different combinations
  • Monitor performance
  • Adjust based on feedback
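
One simple way to test different combinations is to enumerate a small grid and run each setting through your own evaluation. The sketch below only generates the combinations; the function name and the candidate values are assumptions, and the scoring step is left to whatever harness you already use.

```python
from itertools import product

def parameter_grid(temperatures, top_ps):
    """Yield every temperature/top_p pairing as a request-parameter dict."""
    for temperature, top_p in product(temperatures, top_ps):
        yield {"temperature": temperature, "top_p": top_p}

for params in parameter_grid([0.2, 0.7], [0.5, 1.0]):
    print(params)  # pass each dict to your own evaluation harness
```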

Optimization Tips

  • Balance quality vs cost
  • Consider latency impact
  • Monitor token usage
  • Implement caching
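
Caching can be sketched as a lookup keyed on the prompt together with the sampling parameters. This is an in-memory illustration only: `fake_generate` is a hypothetical stand-in for a real model call, and caching is mainly useful when settings are low-randomness, since a cached answer hides the variation that higher temperatures would produce.

```python
import hashlib
import json

_cache = {}

def cache_key(prompt, params):
    """Build a stable key from the prompt and the sampling parameters."""
    payload = json.dumps({"prompt": prompt, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cached_completion(prompt, params, generate):
    """Return a stored response if this exact request was seen before."""
    key = cache_key(prompt, params)
    if key not in _cache:
        _cache[key] = generate(prompt, params)
    return _cache[key]

# `fake_generate` is a hypothetical stand-in for a real model call.
calls = []
def fake_generate(prompt, params):
    calls.append(prompt)
    return f"response to {prompt!r}"

params = {"temperature": 0.0, "top_p": 1.0}
cached_completion("What is 2 + 2?", params, fake_generate)
cached_completion("What is 2 + 2?", params, fake_generate)
print(len(calls))  # the second request is served from the cache
```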

Use Case Examples

Creative Writing

{
  "temperature": 0.8,
  "top_p": 0.9,
  "frequency_penalty": 0.3,
  "presence_penalty": 0.3
}

Factual Responses

{
  "temperature": 0.2,
  "top_p": 0.1,
  "frequency_penalty": 0.0,
  "presence_penalty": 0.0
}

Code Generation

{
  "temperature": 0.3,
  "top_p": 0.2,
  "frequency_penalty": 0.0,
  "presence_penalty": 0.0
}
